Publication detail

Measuring Web Page Similarity Based on Textual and Visual Properties

BARTÍK, V.

Original title

Measuring Web Page Similarity Based on Textual and Visual Properties

Language

en

Original abstract

Measuring web page similarity is an important task in web mining and information retrieval. This paper introduces a method for measuring web page similarity that considers both the textual and visual properties of pages. Textual properties of a page are described by a modified weight vector space model. General visual properties are captured by segmenting a page into visual blocks, whose properties are stored in a vector of visual properties. Both vectors are then used to compute the overall web page similarity. The method is described in detail, and the results of several experiments are presented.
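
As a rough illustration of the idea in the abstract, the sketch below scores two pages by comparing a textual vector and a visual vector separately and then merging the two scores. The abstract does not say how the weights are modified, which visual features are stored, or how the two vectors are combined, so everything here is an assumption: raw term frequencies stand in for the paper's weighting, the visual vectors are taken as already-computed lists of numbers, cosine similarity is used for both parts, and the hypothetical parameter `alpha` mixes them as a convex combination.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity of two equal-length numeric vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def term_vectors(text_a, text_b):
    """Raw term-frequency vectors over the shared vocabulary.

    Assumption: plain term counts, not the modified weighting used in the paper.
    """
    counts_a = Counter(text_a.lower().split())
    counts_b = Counter(text_b.lower().split())
    vocab = sorted(set(counts_a) | set(counts_b))
    return [counts_a[t] for t in vocab], [counts_b[t] for t in vocab]


def page_similarity(text_a, visual_a, text_b, visual_b, alpha=0.5):
    """Mix textual and visual cosine similarity.

    Assumption: a convex combination controlled by `alpha`; the paper's actual
    combination rule is not given in the abstract.
    """
    vec_a, vec_b = term_vectors(text_a, text_b)
    textual = cosine(vec_a, vec_b)
    visual = cosine(visual_a, visual_b)
    return alpha * textual + (1 - alpha) * visual
```

With `alpha=1.0` the score depends on text alone and with `alpha=0.0` on the visual vector alone, which makes it easy to see how much each part contributes for a given pair of pages.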

BibTeX


@inproceedings{BUT76500,
  author="Vladimír {Bartík}",
  title="Measuring Web Page Similarity Based on Textual and Visual Properties",
  annote="Measuring web page similarity is an important task in web mining and
information retrieval. This paper introduces a method for measuring web page
similarity that considers both the textual and visual properties of pages.
Textual properties of a page are described by a modified weight vector space
model. General visual properties are captured by segmenting a page into visual
blocks, whose properties are stored in a vector of visual properties. Both
vectors are then used to compute the overall web page similarity. The method is
described in detail, and the results of several experiments are presented.",
  address="Springer Verlag",
  booktitle="The 11th International Conference on Artificial Intelligence and Soft Computing",
  chapter="76500",
  edition="Lecture Notes in Artificial Intelligence, Vol. 7268",
  howpublished="print",
  institution="Springer Verlag",
  number="7268",
  year="2012",
  month="may",
  pages="13--21",
  publisher="Springer Verlag",
  type="conference paper"
}