TY - GEN
T1 - Selectivity estimation for exclusive query translation in deep web data integration
AU - Jiang, Fangjiao
AU - Meng, Weiyi
AU - Meng, Xiaofeng
PY - 2009
Y1 - 2009
N2 - In Deep Web data integration, some Web database interfaces express exclusive predicates of the form Qe = Pi (Pi ∈ {P1, P2, ..., Pm}), which permit only one predicate to be selected at a time. Accurately and efficiently estimating the selectivity of each Qe is of critical importance to optimal query translation. In this paper, we focus mainly on selectivity estimation for infinite-value attributes, which is more difficult than for key attributes and categorical attributes. First, we compute the attribute correlation and retrieve approximately random attribute-level samples by submitting queries on the least correlated attribute to the actual Web database. Then we estimate the Zipf equation from the word ranks in the sample and the actual selectivity of several words obtained from the actual Web database. Finally, the selectivity of any word on the infinite-value attribute can be derived from the Zipf equation. An experimental evaluation of the proposed selectivity estimation method is provided, and the experimental results show that the method is highly accurate.
AB - In Deep Web data integration, some Web database interfaces express exclusive predicates of the form Qe = Pi (Pi ∈ {P1, P2, ..., Pm}), which permit only one predicate to be selected at a time. Accurately and efficiently estimating the selectivity of each Qe is of critical importance to optimal query translation. In this paper, we focus mainly on selectivity estimation for infinite-value attributes, which is more difficult than for key attributes and categorical attributes. First, we compute the attribute correlation and retrieve approximately random attribute-level samples by submitting queries on the least correlated attribute to the actual Web database. Then we estimate the Zipf equation from the word ranks in the sample and the actual selectivity of several words obtained from the actual Web database. Finally, the selectivity of any word on the infinite-value attribute can be derived from the Zipf equation. An experimental evaluation of the proposed selectivity estimation method is provided, and the experimental results show that the method is highly accurate.
UR - https://www.scopus.com/pages/publications/67650114848
U2 - 10.1007/978-3-642-00887-0_53
DO - 10.1007/978-3-642-00887-0_53
M3 - Conference contribution
SN - 9783642008863
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 595
EP - 600
BT - Database Systems for Advanced Applications - 14th International Conference, DASFAA 2009, Proceedings
T2 - 14th International Conference on Database Systems for Advanced Applications, DASFAA 2009
Y2 - 21 April 2009 through 23 April 2009
ER -