CiteSeerX continues to support the goals outlined by CiteSeer: to actively crawl and harvest academic and scientific documents on the public web, and to support querying by citation and ranking of documents by citation impact. Lee Giles, Prasenjit Mitra, Susan Gauch, Min-Yen Kan, Pradeep Teregowda, Juan Pablo Fernández Ramírez, Pucktada Treeratpituk, Jian Wu, Douglas Jordan, Steve Carman, Jack Carroll, Jim Jansen, and Shuyi Zheng are or have been actively involved in its development. Recently, a table search feature was introduced. It has been funded by the National Science Foundation, NASA, and Microsoft Research.
CiteSeerX crawls publicly available scholarly documents, primarily from author webpages and other open resources, and does not have access to publisher metadata. As such, citation counts in CiteSeerX are usually lower than those in Google Scholar and Microsoft Academic Search, which have access to publisher metadata.
CiteSeer had not been comprehensively updated since 2005 due to limitations in its architecture design. It had a representative sampling of research documents in computer and information science, but its coverage was restricted to papers that are publicly available.
However, these versions of CiteSeer proved difficult to maintain and are no longer available. Because CiteSeer only indexes freely available papers on the web and does not have access to publisher metadata, it returns fewer citation counts than sites such as Google Scholar that have publisher metadata.
CiteSeerX uses automated information extraction tools, usually built on machine learning methods such as ParsCit, to extract scholarly document metadata such as title, authors, abstract, and citations. As such, there are sometimes errors in authors and titles. Other academic search engines have similar errors.
CiteSeer-like engines and archives usually harvest documents only from publicly available websites and do not crawl publisher websites. For this reason, authors whose documents are freely available are more likely to be represented in the index.
However, CiteSeerX has recently been expanding into other scholarly domains such as economics and physics. Released in 2008, it was loosely based on the previous CiteSeer search engine and digital library.
For example, one CiteSeerX DMCA notice states: The document with the identifier "10.1.1.604.4916" has been removed due to a DMCA takedown notice. If you believe the removal has been in error, please contact us through the feedback page, along with the identifier mentioned in this page.
CiteSeerX continues to be rated as one of the world's top repositories, and was rated number 1 in July 2010. It currently has over 6 million documents with nearly 6 million unique authors and 120 million citations.
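CiteSeer was granted United States patent # 6289342, titled "Autonomous citation indexing and literature browsing using citation context", on September 11, 2001. The patent was filed on May 20, 1998, and has priority to January 5, 1998. A continuation patent (US Patent # 6738780) was filed on May 16, 2001, and granted on May 18, 2004.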
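CiteSeerX's goal is to improve the dissemination of and access to academic and scientific literature. As a non-profit service that can be freely used by anyone, it has been considered part of the open access movement, which is attempting to change academic and scientific publishing to allow greater access to scientific literature.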
Citation context showed the context of citations to a given paper, allowing a researcher to quickly and easily see what other researchers have to say about an article of interest.
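The citation context feature can be illustrated with a minimal Python sketch. It assumes citations appear in the citing text as a literal marker such as "[7]" and returns a fixed window of characters around each occurrence. The function name, the `window` parameter, and the sample text are hypothetical and are not taken from CiteSeer's implementation.

```python
import re


def citation_contexts(text: str, marker: str, window: int = 60) -> list[str]:
    """Return the text surrounding every occurrence of a citation marker.

    `window` is the number of characters kept on each side of the marker.
    """
    contexts = []
    for match in re.finditer(re.escape(marker), text):
        start = max(0, match.start() - window)
        end = min(len(text), match.end() + window)
        contexts.append(text[start:end].strip())
    return contexts


# Hypothetical citing text with two mentions of reference [7].
citing_text = (
    "Earlier systems relied on manual curation. "
    "Autonomous citation indexing [7] removes that bottleneck. "
    "We extend the approach of [7] to table search."
)
contexts = citation_contexts(citing_text, "[7]", window=30)
for snippet in contexts:
    print(snippet)
```

A character window is the simplest choice here; a sentence splitter would give cleaner snippets at the cost of more machinery.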
CiteSeerX has nearly one million users worldwide, based on unique IP addresses, and receives millions of hits daily. Annual downloads of document PDFs were nearly 200 million for 2015.
Such papers were usually found at an author's homepage or were submitted by an author. To overcome some of these limitations, a modular and open source architecture for CiteSeer was designed: CiteSeerX.
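CiteSeer.IST had over 700,000 documents. For enhanced access, performance and research, similar versions of CiteSeer were supported at universities such as the Massachusetts Institute of Technology, the University of Zürich and the National University of Singapore.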
Related documents were shown using citation and word-based measures, and an active and continuously updated bibliography was shown for each document.
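Its new modular open source architecture and software (available previously on SourceForge but now on GitHub) is built on Apache Solr and other Apache and open source tools, which allows it to be a testbed for new algorithms in document harvesting, ranking, indexing, and information extraction.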
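It is built with a new open source infrastructure, SeerSuite, and new algorithms and their implementations. It was developed by researchers Isaac Councill and C. Lee Giles at the College of Information Sciences and Technology, Pennsylvania State University.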
Citation statistics and related documents were computed for all articles cited in the database, not just the indexed articles.
CiteSeer became public in 1998 and had many new features unavailable in academic search engines at that time. These included:
Autonomous Citation Indexing automatically created a citation index that can be used for literature search and evaluation.
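As a rough illustration of the idea behind a citation index, the Python sketch below inverts each document's reference list into a map from cited work to citing documents, then ranks cited works by citation count. It assumes reference titles have already been extracted. The crude title normalization, the function names, and the sample corpus are hypothetical simplifications, not CiteSeer's actual algorithm.

```python
from collections import defaultdict


def normalize(title: str) -> str:
    """Crude matching key: lowercase and keep only letters and digits."""
    return "".join(ch for ch in title.lower() if ch.isalnum())


def build_citation_index(documents: dict[str, list[str]]) -> dict[str, set[str]]:
    """Map each cited work (by normalized title) to the documents citing it."""
    index = defaultdict(set)
    for doc_id, cited_titles in documents.items():
        for title in cited_titles:
            index[normalize(title)].add(doc_id)
    return dict(index)


def rank_by_citations(index: dict[str, set[str]]) -> list[tuple[str, int]]:
    """Order cited works by descending citation count, then by key."""
    counts = ((key, len(citers)) for key, citers in index.items())
    return sorted(counts, key=lambda item: (-item[1], item[0]))


# Hypothetical corpus: document id -> titles found in its reference list.
corpus = {
    "doc-a": ["Autonomous Citation Indexing", "Digital Libraries and the Web"],
    "doc-b": ["Autonomous citation indexing."],
    "doc-c": ["Digital libraries and the web", "autonomous  citation indexing"],
}
index = build_citation_index(corpus)
ranking = rank_by_citations(index)
print(ranking)
```

Note that the cited works here are not themselves in the corpus, so statistics are still computed for articles that are cited but not indexed.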
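CiteSeer's goal was to actively crawl and harvest academic and scientific documents on the web and use autonomous citation indexing to permit querying by citation or by document, ranking them by citation impact. At one point, it was called ResearchIndex.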
However, these were not maintained by their sponsors. An older version of both of these could once be found at BizSeer.IST but is no longer in service.
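Further reading: Giles, C. Lee; Bollacker, Kurt D.; Lawrence, Steve (1998). "CiteSeer: an automatic citation indexing system". Proceedings of the Third ACM Conference on Digital Libraries. pp. 89–98. doi:10.1145/276675.276685. ISBN 978-0-89791-965-4.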
CiteSeerX also shares its software, data, databases and metadata with other researchers, currently by Amazon S3 and by rsync.
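CiteSeerX data is regularly shared under a Creative Commons BY-NC-SA license with researchers worldwide and has been and is used in many experiments and competitions. Thanks to its OAI-PMH endpoint, CiteSeerX is an open archive and its content is indexed like an institutional repository in academic search engines, for instance BASE and Unpaywall consumers.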
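To show what harvesting from an OAI-PMH endpoint involves, the following Python sketch builds a standard `ListRecords` request URL and parses Dublin Core titles out of a response. The base URL `https://example.org/oai` and the sample XML are placeholders chosen for illustration; the actual CiteSeerX endpoint address and record contents are not given here and are not assumed.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces defined by the OAI-PMH 2.0 and Dublin Core specifications.
OAI_NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url: str, metadata_prefix: str = "oai_dc") -> str:
    """Build a ListRecords request URL for an OAI-PMH endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


def parse_titles(xml_text: str) -> list[tuple[str, str]]:
    """Extract (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(xml_text.strip())
    pairs = []
    for record in root.iterfind(".//oai:record", OAI_NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=OAI_NS)
        title = record.findtext(".//dc:title", namespaces=OAI_NS)
        pairs.append((identifier, title))
    return pairs


# Placeholder response standing in for what an endpoint might return.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An example paper</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""

url = list_records_url("https://example.org/oai")
records = parse_titles(SAMPLE_RESPONSE)
print(url)
print(records)
```

A real harvester would also follow the `resumptionToken` element to page through large result sets; that step is omitted to keep the sketch short.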
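CiteSeerX replaced CiteSeer, and all queries to CiteSeer were redirected. CiteSeerX is a public search engine, digital library and repository for scientific and academic papers, primarily with a focus on computer and information science.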
CiteSeerX caches some PDF files that it has scanned. As such, each page includes a DMCA link which can be used to report copyright violations.
The CiteSeer model had been extended to cover academic documents in business with SmealSearch and in e-business with eBizSearch.
CiteSeer changed its name to ResearchIndex at one point and then changed it back.
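Other Seer-like search and repository systems have been built for chemistry (ChemXSeer) and for archaeology (ArchSeer). Another, BotSeer, had been built for robots.txt file search. All of these are built on the open source tool SeerSuite, which uses the open source indexer Lucene.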
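CiteSeer freely provided Open Archives Initiative metadata of all indexed documents and linked indexed documents, when possible, to other sources of metadata such as DBLP and the ACM Portal.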
Reference linking allowed browsing of the database using citation links.
CiteSeer is considered a predecessor of academic search tools such as Google Scholar and Microsoft Academic Search.
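CiteSeerX (formerly called CiteSeer) is a public search engine and digital library for scientific and academic papers, primarily in the fields of computer and information science.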
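After NEC, in 2004 it was hosted as CiteSeer.IST on the World Wide Web at the College of Information Sciences and Technology, The Pennsylvania State University.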
To promote open data, CiteSeerX shares its data for non-commercial purposes under a Creative Commons license.
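See also
Arnetminer
arXiv
Collection of Computer Science Bibliographies
DBLP (Digital Bibliography & Library Project)
Disciplinary repository
Google Scholar
List of academic databases and search engines
Microsoft Academic
Research Papers in Economics (RePEc)
Semantic Scholar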
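CiteSeer was created by researchers Lee Giles, Kurt Bollacker and Steve Lawrence in 1997 while they were at the NEC Research Institute (now NEC Labs) in Princeton, New Jersey, US.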
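Type of site: Bibliographic database
Owner: Pennsylvania State University College of Information Sciences and Technology
URL: citeseerx.ist.psu.edu
Registration: Optional
Launched: 2008 (CiteSeerX); 1997 (CiteSeer)
Current status: Active
Content license: Creative Commons BY-NC-SA license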