The original paper is in English. Non-English content has been machine-translated and may contain typographical errors or mistranslations. ex. Some numerals are expressed as "XNUMX".
Copyrights notice
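In this paper, we propose a method for finding web sites whose links have been hijacked by web spammers. A hijacked site is a trustworthy site that points to untrustworthy sites. To detect hijacked sites, we evaluate the trustworthiness of web sites and examine how trustworthy sites are hijacked by untrustworthy sites among their out-neighbors. Trustworthiness is evaluated based on the difference between the white and spam scores calculated by two modified versions of PageRank. We define two hijacked scores that measure how likely a trustworthy site is to be hijacked, based on the distribution of trustworthiness among its out-neighbors. The performance of these hijacked scores is compared using our large-scale Japanese Web archive. The results show that the score that considers both trustworthy and untrustworthy out-neighbors performs better than the one that considers only untrustworthy out-neighbors.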
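The abstract does not state the formulas, so the pipeline it describes can only be sketched under assumptions. The Python example below is one such minimal sketch, not the authors' implementation. Everything in it is hypothetical: the function names `biased_pagerank` and `hijack_scores`, the toy graph, the seed lists, the use of forward propagation for both the white and the spam score, the plain difference `white - spam` as trustworthiness, and the two score definitions (mean distrust of out-neighbors, and a geometric mean of mean trust and mean distrust).

```python
def biased_pagerank(out_links, seeds, damping=0.85, iterations=100):
    """PageRank variant whose random jumps land only on the seed sites."""
    nodes = set(out_links)
    for targets in out_links.values():
        nodes.update(targets)
    seed_set = set(seeds) & nodes
    if not seed_set:
        raise ValueError("at least one seed must be a node of the graph")
    jump = {n: (1.0 / len(seed_set) if n in seed_set else 0.0) for n in nodes}
    score = dict(jump)
    for _ in range(iterations):
        # Mass held by sites without out-links is returned to the seeds.
        dangling = sum(score[n] for n in nodes if not out_links.get(n))
        nxt = {n: (1.0 - damping + damping * dangling) * jump[n] for n in nodes}
        for src, targets in out_links.items():
            for dst in targets:
                nxt[dst] += damping * score[src] / len(targets)
        score = nxt
    return score


def hijack_scores(site, out_links, trust):
    """Return (untrusted_only, both) for a site; assumed definitions.

    untrusted_only: mean distrust over the site's out-neighbors.
    both: geometric mean of mean trust and mean distrust, so it is
          non-zero only when the site links to both kinds of neighbors.
    """
    neighbours = out_links.get(site, [])
    if trust[site] <= 0 or not neighbours:
        return 0.0, 0.0  # only trustworthy sites can be hijacked
    good = sum(trust[n] for n in neighbours if trust[n] > 0) / len(neighbours)
    bad = sum(-trust[n] for n in neighbours if trust[n] < 0) / len(neighbours)
    return bad, (good * bad) ** 0.5


# Toy graph (hypothetical): "victim" is a normal site with one spam link.
out_links = {
    "good": ["victim", "clean"],
    "clean": ["good"],
    "victim": ["clean", "spam"],
    "spam": ["spam2"],
    "spam2": ["spam"],
}
white = biased_pagerank(out_links, seeds=["good"])   # white score
spam = biased_pagerank(out_links, seeds=["spam"])    # spam score
trust = {n: white[n] - spam[n] for n in white}       # assumed trustworthiness
scores = {n: hijack_scores(n, out_links, trust) for n in trust}

for site in sorted(scores, key=lambda s: scores[s][1], reverse=True):
    print(site, round(trust[site], 3), scores[site])
```

In this toy graph only `victim` is trustworthy while also linking to an untrustworthy site, so it is the only site with non-zero hijacked scores. On real data the seed sets and the exact score definitions would have to come from the paper itself.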
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Young-joo CHUNG, Masashi TOYODA, Masaru KITSUREGAWA, "Detecting Hijacked Sites by Web Spammer Using Link-Based Algorithms" in IEICE TRANSACTIONS on Information,
vol. E93-D, no. 6, pp. 1414-1421, June 2010, doi: 10.1587/transinf.E93.D.1414.
Abstract: In this paper, we propose a method for finding web sites whose links are hijacked by web spammers. A hijacked site is a trustworthy site that points to untrustworthy sites. To detect hijacked sites, we evaluate the trustworthiness of web sites, and examine how trustworthy sites are hijacked by untrustworthy sites in their out-neighbors. The trustworthiness is evaluated based on the difference between the white and spam scores that are calculated by two modified versions of PageRank. We define two hijacked scores that measure how likely a trustworthy site is to be hijacked based on the distribution of the trustworthiness in its out-neighbors. The performance of those hijacked scores is compared using our large-scale Japanese Web archive. The results show that a better performance is obtained by the score that considers both trustworthy and untrustworthy out-neighbors, compared with the one that only considers untrustworthy out-neighbors.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.E93.D.1414/_p
@ARTICLE{e93-d_6_1414,
author={Young-joo CHUNG and Masashi TOYODA and Masaru KITSUREGAWA},
journal={IEICE TRANSACTIONS on Information},
title={Detecting Hijacked Sites by Web Spammer Using Link-Based Algorithms},
year={2010},
volume={E93-D},
number={6},
pages={1414-1421},
abstract={In this paper, we propose a method for finding web sites whose links are hijacked by web spammers. A hijacked site is a trustworthy site that points to untrustworthy sites. To detect hijacked sites, we evaluate the trustworthiness of web sites, and examine how trustworthy sites are hijacked by untrustworthy sites in their out-neighbors. The trustworthiness is evaluated based on the difference between the white and spam scores that are calculated by two modified versions of PageRank. We define two hijacked scores that measure how likely a trustworthy site is to be hijacked based on the distribution of the trustworthiness in its out-neighbors. The performance of those hijacked scores is compared using our large-scale Japanese Web archive. The results show that a better performance is obtained by the score that considers both trustworthy and untrustworthy out-neighbors, compared with the one that only considers untrustworthy out-neighbors.},
keywords={},
doi={10.1587/transinf.E93.D.1414},
ISSN={1745-1361},
month={June},
}
TY - JOUR
TI - Detecting Hijacked Sites by Web Spammer Using Link-Based Algorithms
T2 - IEICE TRANSACTIONS on Information
SP - 1414
EP - 1421
AU - Young-joo CHUNG
AU - Masashi TOYODA
AU - Masaru KITSUREGAWA
PY - 2010
DO - 10.1587/transinf.E93.D.1414
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E93-D
IS - 6
JA - IEICE TRANSACTIONS on Information
Y1 - June 2010
AB - In this paper, we propose a method for finding web sites whose links are hijacked by web spammers. A hijacked site is a trustworthy site that points to untrustworthy sites. To detect hijacked sites, we evaluate the trustworthiness of web sites, and examine how trustworthy sites are hijacked by untrustworthy sites in their out-neighbors. The trustworthiness is evaluated based on the difference between the white and spam scores that are calculated by two modified versions of PageRank. We define two hijacked scores that measure how likely a trustworthy site is to be hijacked based on the distribution of the trustworthiness in its out-neighbors. The performance of those hijacked scores is compared using our large-scale Japanese Web archive. The results show that a better performance is obtained by the score that considers both trustworthy and untrustworthy out-neighbors, compared with the one that only considers untrustworthy out-neighbors.
ER -