Syntactic clustering of the Web
A. Broder, S. Glassman, M. Manasse, and G. Zweig.
Computer Networks and ISDN Systems 29 (8-13): 1157–1166 (September 1997)

We have developed an efficient way to determine the syntactic similarity of files and have applied it to every document on the World Wide Web. Using this mechanism, we built a clustering of all the documents that are syntactically similar. Possible applications include a "Lost and Found" service, filtering the results of Web searches, updating widely distributed web-pages, and identifying violations of intellectual property rights.

URL

http://www.sciencedirect.com/science/article/B6TYT-3SP60S4-11/2/38f44c816ec8d69b406317de1629e56d


@article{keyhere,
  abstract = {We have developed an efficient way to determine the syntactic similarity of files and have applied it to every document on the World Wide Web. Using this mechanism, we built a clustering of all the documents that are syntactically similar. Possible applications include a "Lost and Found" service, filtering the results of Web searches, updating widely distributed web-pages, and identifying violations of intellectual property rights.},
  added-at = {2007-11-23T14:11:12.000+0100},
  author = {Broder, Andrei Z. and Glassman, Steven C. and Manasse, Mark S. and Zweig, Geoffrey},
  biburl = {https://puma.uni-kassel.de/bibtex/293a3440b81c13ec81c17481a97719c71/hotho},
  booktitle = {Papers from the Sixth International World Wide Web Conference},
  description = {ScienceDirect - Computer Networks and ISDN Systems : Syntactic clustering of the Web},
  interhash = {424cdc36335873e4d8c0bed6e07e872e},
  intrahash = {93a3440b81c13ec81c17481a97719c71},
  journal = {Computer Networks and ISDN Systems},
  keywords = {Duplication Fingerprints Resemblance Signatures Similarity Web search},
  month = sep,
  number = {8-13},
  pages = {1157--1166},
  timestamp = {2007-11-23T14:11:12.000+0100},
  title = {Syntactic clustering of the Web},
  url = {http://www.sciencedirect.com/science/article/B6TYT-3SP60S4-11/2/38f44c816ec8d69b406317de1629e56d},
  volume = 29,
  year = 1997
}

%0 Journal Article
%1 keyhere
%A Broder, Andrei Z.
%A Glassman, Steven C.
%A Manasse, Mark S.
%A Zweig, Geoffrey
%B Papers from the Sixth International World Wide Web Conference
%D 1997
%J Computer Networks and ISDN Systems
%K Duplication Fingerprints Resemblance Signatures Similarity Web search
%N 8-13
%P 1157-1166
%T Syntactic clustering of the Web
%U http://www.sciencedirect.com/science/article/B6TYT-3SP60S4-11/2/38f44c816ec8d69b406317de1629e56d
%V 29
%X We have developed an efficient way to determine the syntactic similarity of files and have applied it to every document on the World Wide Web. Using this mechanism, we built a clustering of all the documents that are syntactically similar. Possible applications include a "Lost and Found" service, filtering the results of Web searches, updating widely distributed web-pages, and identifying violations of intellectual property rights.
