CiteSeerX

Search engine and digital library for scientific and academic papers

Type of site: Bibliographic database
Owner: Pennsylvania State University College of Information Sciences and Technology
URL: citeseerx.ist.psu.edu
Registration: Optional
Launched: 2008 (CiteSeerX); 1997 (CiteSeer)
Current status: Active
Content license: Creative Commons BY-NC-SA license

CiteSeerX (formerly called CiteSeer) is a public search engine and digital library for scientific and academic papers, primarily in the fields of computer and information science.

CiteSeer's goal is to improve the dissemination and access of academic and scientific literature. As a non-profit service that can be freely used by anyone, it has been considered part of the open access movement that is attempting to change academic and scientific publishing to allow greater access to scientific literature. CiteSeer freely provided Open Archives Initiative metadata of all indexed documents and links indexed documents when possible to other sources of metadata such as DBLP and the ACM Portal. To promote open data, CiteSeerX shares its data for non-commercial purposes under a Creative Commons license.

CiteSeer is considered a predecessor of academic search tools such as Google Scholar and Microsoft Academic Search. CiteSeer-like engines and archives usually only harvest documents from publicly available websites and do not crawl publisher websites. For this reason, authors whose documents are freely available are more likely to be represented in the index.

CiteSeer changed its name to ResearchIndex at one point and then changed it back.

History

CiteSeer and CiteSeer.IST

CiteSeer was created by researchers Lee Giles, Kurt Bollacker and Steve Lawrence in 1997 while they were at the NEC Research Institute (now NEC Labs), Princeton, New Jersey, US. CiteSeer's goal was to actively crawl and harvest academic and scientific documents on the web and use autonomous citation indexing to permit querying by citation or by document, ranking them by citation impact. At one point, it was called ResearchIndex.

CiteSeer became public in 1998 and had many new features unavailable in academic search engines at that time. These included:

Autonomous Citation Indexing automatically created a citation index that can be used for literature search and evaluation.
Citation statistics and related documents were computed for all articles cited in the database, not just the indexed articles.
Reference linking, allowing browsing of the database using citation links.
Citation context showed the context of citations to a given paper, allowing a researcher to quickly and easily see what other researchers have to say about an article of interest.
Related documents were shown using citation and word based measures, and an active and continuously updated bibliography is shown for each document.

CiteSeer was granted a United States patent # 6289342, titled "Autonomous citation indexing and literature browsing using citation context", on September 11, 2001. The patent was filed on May 20, 1998, and has priority to January 5, 1998. A continuation patent (US Patent # 6738780) was filed on May 16, 2001, and granted on May 18, 2004.

After NEC, in 2004 it was hosted as CiteSeer.IST on the World Wide Web at the College of Information Sciences and Technology, The Pennsylvania State University, and had over 700,000 documents. For enhanced access, performance and research, similar versions of CiteSeer were supported at universities such as the Massachusetts Institute of Technology, University of Zürich and the National University of Singapore. However, these versions of CiteSeer proved difficult to maintain and are no longer available. Because CiteSeer only indexes freely available papers on the web and does not have access to publisher metadata, it returns lower citation counts than sites, such as Google Scholar, that have publisher metadata.

CiteSeer had not been comprehensively updated since 2005 due to limitations in its architecture design. It had a representative sampling of research documents in computer and information science but was limited in coverage because it was limited to papers that are publicly available, usually at an author's homepage, or those submitted by an author. To overcome some of these limitations, a modular and open source architecture for CiteSeer was designed: CiteSeerX.

CiteSeerX

CiteSeerX replaced CiteSeer and all queries to CiteSeer were redirected. CiteSeerX is a public search engine and digital library and repository for scientific and academic papers, primarily with a focus on computer and information science. However, recently CiteSeerX has been expanding into other scholarly domains such as economics, physics and others. Released in 2008, it was loosely based on the previous CiteSeer search engine and digital library and is built with a new open source infrastructure, SeerSuite, and new algorithms and their implementations. It was developed by researchers Isaac Councill and C. Lee Giles at the College of Information Sciences and Technology, Pennsylvania State University. It continues to support the goals outlined by CiteSeer: to actively crawl and harvest academic and scientific documents on the public web, and to support querying by citation and ranking of documents by citation impact. Currently, Lee Giles, Prasenjit Mitra, Susan Gauch, Min-Yen Kan, Pradeep Teregowda, Juan Pablo Fernández Ramírez, Pucktada Treeratpituk, Jian Wu, Douglas Jordan, Steve Carman, Jack Carroll, Jim Jansen, and Shuyi Zheng are or have been actively involved in its development. Recently, a table search feature was introduced. It has been funded by the National Science Foundation, NASA, and Microsoft Research.

CiteSeerX continues to be rated as one of the world's top repositories, and was rated number 1 in July 2010. It currently has over 6 million documents with nearly 6 million unique authors and 120 million citations.

CiteSeerX also shares its software, data, databases and metadata with other researchers, currently by Amazon S3 and by rsync. Its new modular open source architecture and software (available previously on SourceForge but now on GitHub) is built on Apache Solr and other Apache and open source tools, which allows it to be a testbed for new algorithms in document harvesting, ranking, indexing, and information extraction.

CiteSeerX caches some PDF files that it has scanned. As such, each page includes a DMCA link which can be used to report copyright violations. For example, one archived notice reads: "The document with the identifier "10.1.1.604.4916" has been removed due to a DMCA takedown notice. If you believe the removal has been in error, please contact us through the feedback page, along with the identifier mentioned in this page."

Current features

Automated information extraction

CiteSeerX uses automated information extraction tools, usually built on machine learning methods such as ParsCit, to extract scholarly document metadata such as title, authors, abstract, citations, etc. As such, there are sometimes errors in authors and titles. Other academic search engines have similar errors.

Focused crawling

CiteSeerX crawls publicly available scholarly documents primarily from author webpages and other open resources, and does not have access to publisher metadata. As such, citation counts in CiteSeerX are usually less than those in Google Scholar and Microsoft Academic Search, which have access to publisher metadata.

Usage

CiteSeerX has nearly one million users worldwide based on unique IP addresses and has millions of hits daily. Annual downloads of document PDFs were nearly 200 million for 2015.

Data

CiteSeerX data is regularly shared under a Creative Commons BY-NC-SA license with researchers worldwide and has been and is used in many experiments and competitions. Thanks to its OAI-PMH endpoint, CiteSeerX is an open archive and its content is indexed like an institutional repository in academic search engines, for instance BASE and Unpaywall consumers.

Other SeerSuite-based search engines

The CiteSeer model had been extended to cover academic documents in business with SmealSearch and in e-business with eBizSearch. However, these were not maintained by their sponsors. An older version of both of these could once be found at BizSeer.IST but is no longer in service.

Other Seer-like search and repository systems have been built for chemistry, ChemXSeer, and for archaeology, ArchSeer. Another had been built for robots.txt file search, BotSeer. All of these are built on the open source tool SeerSuite, which uses the open source indexer Lucene.

See also

Arnetminer
arXiv
Collection of Computer Science Bibliographies
DBLP (Digital Bibliography & Library Project)
Disciplinary repository
Google Scholar
List of academic databases and search engines
Microsoft Academic
Research Papers in Economics (RePEc)
Semantic Scholar

Further reading

Giles, C. Lee; Bollacker, Kurt D.; Lawrence, Steve (1998). "CiteSeer: an automatic citation indexing system". Proceedings of the Third ACM Conference on Digital Libraries. pp. 89–98. doi:10.1145/276675.276685. ISBN 978-0-89791-965-4.

External links

Official website: citeseerx.ist.psu.edu

