Here I propose a Perl script which does this automatically using Inspire, the standard search engine for high-energy physics and related fields.
Of course I do not believe that it is very relevant or useful to know whether an article is published and in which journal, but journals typically require such data to be displayed in bibliographies. In any case, the decision to display publication data or not is made at the level of the bibliography style file: the Bibtex file itself is only a database, which should be as complete as possible.
So what does the script do? It takes a Bibtex file and, for each entry which has no publication data but does have an eprint field with an arXiv number, sends a request to Inspire. If publication data are found, then the entry is modified to include them. (The rest of the entry, and in particular the key, is not changed.)
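The selection rule described above can be sketched in a few lines of Perl. This is only an illustration under stated assumptions, not the code of bibpup.pl itself: the helper name entries_needing_update is hypothetical, and it treats the presence of a journal field as the sign that publication data already exist.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of bibpup.pl): returns the keys of the
# entries that have an eprint field but no journal field.
sub entries_needing_update {
    my ($bibtext) = @_;
    my @keys;
    # Split before each '@' that starts a line, so each chunk is one entry.
    for my $entry ( split /^(?=\@)/m, $bibtext ) {
        next unless $entry =~ /^\@\w+\{\s*([^,\s]+)/;
        my $key         = $1;
        my $has_journal = $entry =~ /^\s*journal\s*=/mi;
        my $has_eprint  = $entry =~ /^\s*eprint\s*=\s*"[^"]+"/mi;
        push @keys, $key if $has_eprint && !$has_journal;
    }
    return @keys;
}

# Two shortened sample entries: only the first one lacks a journal.
my $bib = <<'BIB';
@article{cer12,
  title = "{Seiberg-Witten equations}",
  year = "2012",
  eprint = "1209.3984",
}
@Article{fr10,
  title = "{Conformal Toda theory with a boundary}",
  journal = "JHEP",
  eprint = "1007.1293",
}
BIB

print join( ", ", entries_needing_update($bib) ), "\n";
```

With these two sample entries, only cer12 is reported, since fr10 already has a journal field.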
For example, with the following Bibtex file called test1.bib,
@article{cer12,
author = "Chekhov, Leonid and Eynard, Bertrand and Ribault,
Sylvain",
title = "{Seiberg-Witten equations and non-commutative spectral curves in Liouville theory}",
year = "2012",
eprint = "1209.3984",
archivePrefix = "arXiv",
primaryClass = "hep-th",
reportNumber = "IPHT-T12-075",
SLACcitation = "%%CITATION = ARXIV:1209.3984;%%",
}
@Article{fr10,
author = "Fateev, Vladimir and Ribault, Sylvain",
title = "{Conformal Toda theory with a boundary}",
journal = "JHEP",
volume = "12",
year = "2010",
pages = "089",
eprint = "1007.1293",
archivePrefix = "arXiv",
primaryClass = "hep-th",
doi = "10.1007/JHEP12(2010)089",
SLACcitation = "%%CITATION = 1007.1293;%%"
}
@Article{rib03b,
author = "Ribault, Sylvain",
title = "Strings and D-branes in curved space-times. (In French)",
year = "2003",
eprint = "hep-th/0309272",
SLACcitation = "%%CITATION = HEP-TH 0309272;%%"
}
the script runs as follows:
bibpup.pl -e test1.bib
Using Inspire for the entry cer12,
Using Inspire for the entry rib03b,
No valid update found in Inspire for the labels: rib03b,
Labels whose entries were updated: cer12,
After that, the Bibtex file looks like this:
@article{cer12,
author = "Chekhov, Leonid and Eynard, Bertrand and Ribault,
Sylvain",
title = "{Seiberg-Witten equations and non-commutative spectral curves in Liouville theory}",
journal = "J.Math.Phys.",
volume = "54",
pages = "022306",
doi = "10.1063/1.4792241",
year = "2013",
eprint = "1209.3984",
archivePrefix = "arXiv",
primaryClass = "hep-th",
reportNumber = "IPHT-T12-075",
SLACcitation = "%%CITATION = ARXIV:1209.3984;%%",
}
@Article{fr10,
author = "Fateev, Vladimir and Ribault, Sylvain",
title = "{Conformal Toda theory with a boundary}",
journal = "JHEP",
volume = "12",
year = "2010",
pages = "089",
eprint = "1007.1293",
archivePrefix = "arXiv",
primaryClass = "hep-th",
doi = "10.1007/JHEP12(2010)089",
SLACcitation = "%%CITATION = 1007.1293;%%"
}
@Article{rib03b,
author = "Ribault, Sylvain",
title = "Strings and D-branes in curved space-times. (In French)",
year = "2003",
eprint = "hep-th/0309272",
SLACcitation = "%%CITATION = HEP-TH 0309272;%%"
}
The script is given at the end of this post. To install, save it in your bin directory and make sure it is executable:
chmod +x bibpup.pl
It is possible to write a default name for the Bibtex file into the script, in which case the command for executing is simply
bibpup.pl
So here is the script:
eval '(exit $?0)' && eval 'exec perl -x -S $0 ${1+"$@"}' && eval 'exec perl -x -S $0 $argv:q'
  if 0;
#!/usr/local/bin/perl -w
use strict;
use File::Copy;

#Note: using & at the end of unix commands called with 'system' sometimes produces
#strange results, namely later commands being ignored!

#Settings
my $myname = "bibpup.pl"; #name of this program
my $version_num = '1.01';
my $version_details = "$myname $version_num, by Sylvain Ribault";
my $tmpfile = "tmp.bib"; #name of the file where Inspire data are written
my $thecmd = "wget -q -O $tmpfile"; #the command
my $theurl = "http://inspirehep.net/search?ln=en&p="; #relevant URL
my $beginsearch = "find+bb+"; #search command, as suggested by the source of the Inspire webpage
my $endsearch = "&of=hx&action_search=Search"; #end of the search command
my $outfield = "journal"; #BIBTEX field we want to update
my $infield = "eprint"; #BIBTEX field used for finding articles
my $entryend = "^\}"; #string used to detect the end of a BIBTEX entry
my @replacefield = ("year", "eprint", "SLACcitation"); #BIBTEX fields where updating can start
my $bibdefault = "test1.bib"; #default BIB file to be updated

#Global variables
my $bibfile = ""; #BIB file to be updated
my $readfile = ""; #Same file, but now read-only
my $bakfile = ""; #name of the backup copy
my $do_help = 0;
my $do_version = 0;
my $do_enter = 0;

#========================================================================
$bibfile = $bibdefault;
while ($_ = $ARGV[0]) {
  if ( /^-/ ){
    if ( /h/ ){
      $do_help = 1;
    }
    if ( /e/ ){
      $do_enter = 1;
    }
    if ( /v/ ){
      $do_version = 1;
    }
  }
  elsif ( $do_enter == 0 ){
    if ( /\S/ ){
      print "$myname: \"Superfluous arguments. I ignore them and proceed.\"\n";
    }
    last;
  }
  else {
    $bibfile = $ARGV[0];
    last;
  }
  shift;
}
if ( $do_version == 1 ){
  print "Running $version_details\n";
}
if ( $do_help == 1 ){
  print "$myname: \"Just printing help:\"\n";
  print_help();
  exit 0;
}
$bakfile = "$bibfile.bak";
rename( $bibfile, $bakfile )
  or die "Cannot create backup file \"$bakfile\": file \"$bibfile\" may not exist\n";
$readfile = "$bibfile.read";
copy( $bakfile, $readfile )
  or die "Cannot create reading file \"$readfile\": file \"$bakfile\" may not exist\n";
updatebib();
unlink $readfile;
#unlink $bakfile;
#unlink $tmpfile;
exit 0;

#=========================================================================
sub updatebib{
  my $entry = 0; #will become 1 if we find a BIB entry, 2 after a replacefield, 3 after the outfield
  my $replace = 0; #will become 1 if entry must be modified
  my $searchitem = ""; #value of $infield, used for finding articles
  my $label = ""; #label of articles
  my $spirestalk = "";
  my @labels = ();
  my @cleaned_labels =();
  my @dirty_labels =();
  my $begin_entry; #text of an entry up to the replacefield, not included
  my $full_entry; #text of an entry
  local *IN;
  local *OUT;
  open( IN, "<$readfile" )
    or die "Cannot read \"$readfile\"\n";
  open( OUT, ">$bibfile" )
    or die "Cannot write on \"$bibfile\"\n";
  while (<IN>) {
    my $line = $_;
    if ( /^@\w+\{(\S+)/ ){
      # print "Entry found! \"$1\"\n"; # own label for the article
      $label = $1;
      push @labels , $label;
      $searchitem = "";
      $begin_entry = "";
      $full_entry = "";
      $entry = 1;
      $replace = 0;
    }
    $full_entry = $full_entry.$line;
    if ( $entry != 0 ){
      if ( /$outfield/ ){
        $entry = 3;
      }
      if ( /$infield[^\"]+\"([^\"]+)\"/ ){
        $searchitem = $1;
        # print "$searchitem\n"; # eprint number
      }
      for (my $i = 0; $i <= $#replacefield; $i++ ) {
        if ( /$replacefield[$i]/ && $entry == 1 ){
          $entry = 2;
        }
      }
    }
    if ( $line =~ /$entryend/ && $entry != 0 && $entry != 3 && $searchitem ne "" ){
      print "Using Inspire for the entry $label\n";
      # print "$searchitem\n"; # eprint number
      $spirestalk = getoutfield($searchitem);
      # print "What we found: $spirestalk\n"; # Journal data from Inspire
      if ( $spirestalk eq "" ){
        push @dirty_labels , $label;
      }
      else {
        push @cleaned_labels , $label;
        $replace = 1;
      }
      $searchitem = "";
    }
    if ( $entry == 0 ){
      print OUT $line;
    }
    if ( $line =~ /$entryend/ && $entry != 0 ){
      $entry = 0;
      if ( $replace == 1 ){
        $replace = 0;
        print OUT $begin_entry;
        print OUT $spirestalk;
      }
      else {
        print OUT $full_entry;
      }
    }
    if ( $entry == 1 ){
      $begin_entry = $begin_entry.$line;
    }
  }
  print "No valid update found in Inspire for the labels: ";
  for (my $i = 0; $i <= $#dirty_labels; $i++ ) {
    print "$dirty_labels[$i] ";
  }
  print "\n";
  print "Labels whose entries were updated: ";
  for (my $i = 0; $i <= $#cleaned_labels; $i++ ) {
    print "$cleaned_labels[$i] ";
  }
  print "\n";
  close IN;
  close OUT;
}

#====================================================================
sub getoutfield{
  # Given a $lookitem (arXiv number), gets the $outfield (journal ref) from Inspire.
  my $lookitem = $_[0];
  my $outresult = "";
  my $fullcmd = $thecmd." \"".$theurl.$beginsearch.$lookitem.$endsearch."\"";
  my $inentry = 0;
  # print "$fullcmd\n"; # search command given to Inspire
  system($fullcmd) == 0
    or die "\"system $fullcmd failed: $?\"";
  local *IN;
  open( IN, "<$tmpfile" )
    or die "Cannot read \"$tmpfile\"\n";
  while (<IN>) {
    my $theline = $_;
    if ( /^@\w+\{(\S+)/ ){
      $inentry = 1;
      # print "Inspire label: $1\n"; # Inspire label for that article
    }
    if ( $inentry == 1 && /$outfield/ ){
      $inentry = 2;
    }
    if ( $inentry == 2 ){
      $outresult = $outresult.$theline;
    }
    if ( /$entryend/ ){
      $inentry = 0;
    }
  }
  close IN;
  unlink $tmpfile;
  # print "Just found with Inspire: $outresult\n";
  return $outresult;
}

#==================================================================
sub print_help{
  my $replacelist = "";
  for (my $i = 0; $i <= $#replacefield; $i++ ) {
    $replacelist = $replacelist."'$replacefield[$i]', ";
  }
  print <<HELP;
Usage: $myname [options] [file]
where it is necessary to enter a file name only if option -e is present,
otherwise default file name is used.
Browses the BIBTEX file, looking for entries with no '$outfield'.
If an '$infield' is nevertheless given, uses it to search Inspire for
possible '$outfield' data. If data are found, replaces the part of the
BIBTEX entry subsequent to a field $replacelist
with the new entry from Inspire subsequent to '$outfield'. In particular
this preserves the label. The original file is saved as a '.bak' file.
Options:
-v indicates -version number.
-h prints the present -help and dies.
-e expects names of the file to be -entered explicitly.
HELP
}
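The central step of the script is the splice: the old entry is kept up to the first of the fields 'year', 'eprint', 'SLACcitation', and the rest is replaced by the Inspire entry from its 'journal' line onward. The sketch below isolates that step on in-memory strings, without any network access. The helper splice_entry and the label InspireLabel are hypothetical, and the sample Inspire answer is an assumption modelled on the updated cer12 entry shown earlier.

```perl
use strict;
use warnings;

# Hypothetical helper illustrating the splice; bibpup.pl does this inline
# while reading the file line by line.
sub splice_entry {
    my ( $old, $new, $outfield, @replacefield ) = @_;
    my $stop = join "|", map { quotemeta($_) } @replacefield;
    my ( $head, $tail ) = ( "", "" );
    # Keep the old entry up to (not including) the first replaceable field.
    for my $line ( split /^/m, $old ) {
        last if $line =~ /$stop/;
        $head .= $line;
    }
    # Keep the new entry from the outfield line down to the closing brace.
    my $copying = 0;
    for my $line ( split /^/m, $new ) {
        $copying = 1 if $line =~ /\Q$outfield\E/;
        $tail .= $line if $copying;
    }
    # Without publication data in the new entry, leave the old one untouched.
    return $tail eq "" ? $old : $head . $tail;
}

my $old = <<'OLD';
@article{cer12,
author = "Chekhov, Leonid and Eynard, Bertrand and Ribault, Sylvain",
year = "2012",
eprint = "1209.3984",
}
OLD

# Assumed shape of the Inspire answer; the label is a placeholder.
my $new = <<'NEW';
@article{InspireLabel,
author = "Chekhov, Leonid and Eynard, Bertrand and Ribault, Sylvain",
journal = "J.Math.Phys.",
volume = "54",
pages = "022306",
year = "2013",
eprint = "1209.3984",
}
NEW

print splice_entry( $old, $new, "journal", "year", "eprint", "SLACcitation" );
```

The printed entry keeps the label cer12 and the author line of the old entry, followed by the journal, volume, pages, year and eprint lines of the new one. As in the script, field names are matched anywhere in a line, so a title containing the word "year" would cut the old entry too early.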