Amber: A Debuggable Dataflow System Based on the Actor Model, Avinash Kumar, Zuozhi Wang, Shengquan Ni, and Chen Li, VLDB 2020.
Similarity query support in big data management systems, Taewoo Kim, Wenhai Li, Alexander Behm, Inci Cetindil, Rares Vernica, Vinayak Borkar, Michael J. Carey, Chen Li, Information Systems, Volume 88, February 2020. [Paper]
Use of Twitter data to improve Zika virus surveillance in the United States during the 2016 epidemic, Shahir Masri, Jianfeng Jia, Chen Li, Guofa Zhou, Ming-Chieh Lee, Guiyun Yan and Jun Wu, BMC Public Health. [Paper]
Inves: Incremental Partitioning-Based Verification for Graph Similarity Search, Jongik Kim, Dong-Hoon Choi, and Chen Li, EDBT 2019. [Paper] [PPT]
Visually Analyzing A Billion Tweets: An Application for Collaborative Visual Analytics on Large High-Resolution Display, Simon Su, Michael An, Vincent Perry, Jianfeng Jia, Taewoo Kim, Te-Yu Chen, and Chen Li, Workshop on Analysis of Large-scale Disparate Data. BigData 2018: 3597-3606. [Paper]
Heatflip: Temporal-Spatial Sampling for Progressive Heat Maps on Social Media Data, Niklas Stoehr, Johannes Meyer, Volker Markl, Qiushi Bai, Taewoo Kim, De-Yu Chen, Chen Li. Workshop on Big Social Media Data Management and Analysis (BSMDMA2018), BigData 2018: 3723-3732. [Paper]
ZigZag: Supporting Similarity Queries on Vector Space Models, Wenhai Li, Lingfeng Deng, Yang Li, and Chen Li, SIGMOD 2018. [Paper] [Talk] [Poster]
End-to-End Machine Learning with Apache AsterixDB, Wail Alkowaileet, Sattam Alsubaiee, Michael J. Carey, Chen Li, Heri Ramampiaro, Phanwadee Sinthong, Xikui Wang, DEEM Workshop 2018 (Co-located with SIGMOD). [Paper] [Talk] [Poster]
Supporting Similarity Queries in Apache AsterixDB, Taewoo Kim, Wenhai Li, Alexander Behm, Inci Cetindil, Rares Vernica, Vinayak Borkar, Michael Carey, Chen Li, EDBT 2018. [Paper] [Talk]
Drum: A Rhythmic Approach to Interactive Analytics on Large Data, Jianfeng Jia, Chen Li, Michael J. Carey, IEEE Big Data 2017. [Paper] [Talk]
Caching Geospatial Objects in Web Browsers, Taewoo Kim, Vidhyasagar Thirumaraiselvan, Jianfeng Jia, Chen Li, ACM SIGSPATIAL 2017 (Demo Paper). [Paper]
Visual Analytics Ecology for Complex System Testing, Simon Su, Michael Barton, Michael An, Vincent Perry, Chen Li, Jianfeng Jia, Brian Panneton, Visualization in Practice 2017 at IEEE VIS 2017. [Paper]
Twitter Coverage of Climate Change and Health before and after the 2016 US Presidential Election, Suellen Hopfer, Miryha Runnerstrom, Jianfeng Jia, Taewoo Kim, Chen Li, American Public Health Association 2017.
A Demonstration of TextDB: Declarative and Scalable Text Analytics on Large Data Sets, Zuozhi Wang, Flavio Bayer, Seungjin Lee, Kishore Narendran, Xuxi Pan, Qing Tang, Jimmy Wang, Chen Li, ICDE 2017 Demo. (Best Demo award) [PDF] [video].
A Comparative Study of Log-Structured Merge-Tree-Based Spatial Indexes for Big Data, Young-Seok Kim, Taewoo Kim, Michael J. Carey, Chen Li, ICDE 2017 Poster. [PDF]
Hobbes3: Dynamic Generation of Variable-Length Signatures for Efficient Approximate Subsequence Mappings, Jongik Kim, Chen Li, Xiaohui Xie, ICDE 2016. [PDF] [PPT]
Towards Interactive Analytics and Visualization on One Billion Tweets, Jianfeng Jia, Chen Li, Xi Zhang, Chao Li, Michael J. Carey, Simon Su, ACM GIS 2016 (Demo). [PDF]
Negative Factor: Improving Regular-Expression Matching in Strings, Xiaochun Yang, Tao Qiu, Bin Wang, Baihua Zheng, Yaoshu Wang, Chen Li, ACM TODS, 2015.
LSM Based Storage and Indexing: An Old Idea with Timely Benefits, Sattam Alsubaiee, Michael J. Carey, Chen Li, GeoRich 2015.
AsterixDB: A Scalable, Open Source BDMS, Sattam Alsubaiee, Yasser Altowim, Hotham Altwaijry, Alexander Behm, Vinayak R. Borkar, Yingyi Bu, Michael J. Carey, Inci Cetindil, Madhusudan Cheelangi, Khurram Faraaz, Eugenia Gabrielova, Raman Grover, Zachary Heilbron, Young-Seok Kim, Chen Li, Guangqiang Li, Ji Mahn Ok, Nicola Onose, Pouria Pirzadeh, Vassilis J. Tsotras, Rares Vernica, Jian Wen, Till Westmann. PVLDB 7(14): 1905-1916 (2014). [PDF]
Improving read mapping using additional prefix grams, Jongik Kim, Chen Li, Xiaohui Xie. BMC Bioinformatics 15: 42 (2014)
Storage Management in AsterixDB, Sattam Alsubaiee, Alexander Behm, Vinayak R. Borkar, Zachary Heilbron, Young-Seok Kim, Michael J. Carey, Markus Dreseler, Chen Li, PVLDB 7(10): 841-852 (2014). [PDF]
Efficient instant-fuzzy search with proximity ranking, Inci Cetindil, Jamshid Esmaelnezhad, Taewoo Kim, Chen Li. ICDE 2014: 328-339. [PDF]
Supporting Search-As-You-Type Using SQL in Databases, Guoliang Li, Jianhua Feng, Chen Li, IEEE Trans. Knowl. Data Eng. 25(2): 461-475 (2013)
Efficient direct search on compressed genomic data, Xiaochun Yang, Bin Wang, Chen Li, Jiaying Wang, Xiaohui Xie, ICDE 2013: 961-972. [PDF]
Improving regular-expression matching on strings using negative factors, Xiaochun Yang, Bin Wang, Tao Qiu, Yaoshu Wang, Chen Li, SIGMOD Conference 2013: 361-372. [PDF]
Inside “Big Data Management”: Ogres, Onions, or Parfaits? Vinayak Borkar, Michael J. Carey, and Chen Li, EDBT 2012. [PDF]
ASTERIX: An Open Source System for “Big Data” Management and Analysis, Sattam Alsubaiee, Yasser Altowim, Hotham Altwaijry, Alexander Behm, Vinayak R. Borkar, Yingyi Bu, Michael J. Carey, Raman Grover, Zachary Heilbron, Young-Seok Kim, Chen Li, Nicola Onose, Pouria Pirzadeh, Rares Vernica, Jian Wen. PVLDB 2012 (demo).
Speeding Up Chemical Searches Using the Inverted Index: The Convergence of Chemoinformatics and Text Search Methods, Ramzi Nasr, Rares Vernica, Chen Li, Pierre Baldi. Journal of Chemical Information and Modeling 52(4): 891-900 (2012)
Big data platforms: what’s next? Vinayak R. Borkar, Michael J. Carey, Chen Li. ACM Crossroads 19(1): 44-49, 2012.
Hobbes: optimized gram-based methods for efficient read alignment, Athena Ahmadi, Alexander Behm, Nagesh Honnalli, Chen Li, Lingjie Weng, and Xiaohui Xie, Nucleic Acids Research 2011; doi: 10.1093/nar/gkr1246. [PDF]
SKIF-P: a point-based indexing and ranking of web documents for spatial-keyword search, Ali Khodaei, Cyrus Shahabi, and Chen Li, Geoinformatica, Springer, 2011. [PDF]
Supporting BioMedical Information Retrieval: The BioTracer Approach, Heri Ramampiaro and Chen Li, In Transactions on Large-Scale Data- and Knowledge-Centered Systems (TLDKS), 2011, No.4. Vol. 6990, Springer. pp. 73–94. [PDF]
CHIME: An Efficient Error-Tolerant Chinese Pinyin Input Method, Yabin Zheng, Chen Li, Maosong Sun, IJCAI 2011, 2551-2556. [PDF], [Demo]
Answering Approximate String Queries on Large Data Sets Using External Memory, Alexander Behm, Chen Li, and Michael Carey, ICDE 2011. [PDF] [Source Code]
ASTERIX: towards a scalable, semistructured data platform for evolving-world models. Alexander Behm, Vinayak R. Borkar, Michael J. Carey, Raman Grover, Chen Li, Nicola Onose, Rares Vernica, Alin Deutsch, Yannis Papakonstantinou, Vassilis J. Tsotras, Distributed and Parallel Databases, 2011, 29(3), 185-216. [PDF]
Efficient fuzzy full-text type-ahead search, Guoliang Li, Shengyue Ji, Chen Li, Jianhua Feng. VLDB J. 20(4): 617-640 (2011). [PDF]
qSpell: Spelling Correction of Web Search Queries using Ranking Models and Iterative Correction. Yasser Ganjisaffar, Andrea Zilio, Sara Javanmardi, Inci Cetindil, Manik Sikka, Sandeep Katumalla, Narges Khatib, Chen Li, Cristina Lopes, Spelling Alteration for Web Search Workshop, July 2011. [PDF], [Dataset] (The authors won third place in Microsoft’s Speller Challenge in 2011.)
The Flamingo Software Package on Approximate String Queries. Chen Li, DASFAA Workshops 2011, 477. [PDF], [Source Code]
Interactive and Fuzzy Search: A Dynamic Way to Explore MEDLINE, Jiannan Wang, Inci Cetindil, Shengyue Ji, Chen Li, Xiaohui Xie, Guoliang Li, Jianhua Feng, Journal of Bioinformatics, 2010. [PDF]
Supporting Location-Based Approximate-Keyword Queries, Sattam Alsubaiee, Alexander Behm, and Chen Li, ACM GIS 2010. [PDF] [PPT] [Source Code and Demos]
Hybrid Indexing and Seamless Ranking of Spatial and Textual Features of Web Documents, Ali Khodaei, Cyrus Shahabi, Chen Li, DEXA 2010. [PDF]
Efficient Parallel Set-Similarity Joins Using MapReduce. Rares Vernica, Michael J. Carey, Chen Li, SIGMOD 2010. [PDF], [Source Code]
SAIL: Structure-aware indexing for effective and progressive top-k keyword search over XML documents, Guoliang Li, Chen Li, Jianhua Feng, Lizhu Zhou: Inf. Sci. 179(21): 3745-3762 (2009). [PDF]
Human genomes as email attachments. Scott Christley, Yiming Lu, Chen Li, and Xiaohui Xie, Bioinformatics 25: 274-275 (2009). [PDF]. [Source Code]. It was the most downloaded article on the Web site of the Journal of Bioinformatics for two months.
Rewriting Queries using Views, Chen Li: Encyclopedia of Database Systems 2009: 2438-2441. [PDF]
Type-Ahead Search on Relational Data: a TASTIER Approach, Guoliang Li, Shengyue Ji, Chen Li, and Jianhua Feng, SIGMOD 2009. [PDF], [PPTX].
Efficient Interactive Fuzzy Keyword Search, Shengyue Ji, Guoliang Li, Chen Li, and Jianhua Feng, WWW 2009. [PDF], [PPTX]
Best-Effort Top-k Query Processing Under Budgetary Constraints, Michal Shmueli-Scheuer, Chen Li, Yosi Mass, Haggai Roitman, Ralf Schenkel, and Gerhard Weikum, ICDE 2009. [PDF], [PPT]
Space-Constrained Gram-Based Indexing for Efficient Approximate String Search, Alexander Behm, Shengyue Ji, Chen Li, and Jiaheng Lu, ICDE 2009. [PDF], [PPTX]
Efficient top-k algorithms for fuzzy search in string collections, Rares Vernica, Chen Li, KEYS 2009: 9-14. [PDF], [Talk Slides]
Efficient Approximate Search on String Collections (Tutorial), Marios Hadjeleftheriou and Chen Li, VLDB 2009. [PDF], [Part I], [Part II]
SEPIA: Estimating Selectivities of Approximate String Predicates in Large Databases. Liang Jin, Chen Li, and Rares Vernica. VLDB Journal, Volume 17, Number 5, pages 1213-1229, August 2008. [PDF]
Cost-Based Variable-Length-Gram Selection for String Collections to Support Approximate Queries Efficiently, Xiaochun Yang, Bin Wang, and Chen Li, ACM SIGMOD 2008. [PDF], [PPT]
Efficient Merging and Filtering Algorithms for Approximate String Searches, Chen Li, Jiaheng Lu, and Yiming Lu. ICDE 2008. [PDF], [PPT], [Source Code].
Data Exchange with Arithmetic Comparisons, Foto Afrati, Chen Li, and Vassia Pavlaki. EDBT 2008. [PDF]
Quality-Aware Retrieval of Data Objects from Autonomous Sources for Web-Based Repositories, Houtan Shirani-Mehr, Chen Li, Gang Liang, Michal Shmueli-Scheuer, ICDE 2008 (poster). [PDF] [Technical Report]
Using Views to Generate Efficient Evaluation Plans for Queries, Foto Afrati, Chen Li, and Jeff Ullman, Journal of Computer and System Sciences, Volume 73, Issue 5, pages 703-724, August 2007. [PDF]
VGRAM: Improving Performance of Approximate Queries on String Collections Using Variable-Length Grams, Chen Li, Bin Wang, and Xiaochun Yang. VLDB 2007. [PDF], [PPT]
Protecting Individual Information Against Inference Attacks in Data Publishing, Chen Li, Houtan Shirani-Mehr, and Xiaochun Yang. DASFAA 2007. [PDF]
Communication-Efficient Query Answering with Quality Guarantees in Client-Server Applications. Michal Shmueli-Scheuer, Amitabh Chaudhary, Avigdor Gal, Chen Li. WebDB 2007. [PDF]
Rewriting Queries Using Views in the Presence of Arithmetic Comparisons, Foto Afrati, Chen Li, and Prasenjit Mitra, Theoretical Computer Science, Volume 368, Numbers 1-2, pages 88-123, 2006. [PDF]
Supporting Efficient Record Linkage for Large Data Sets Using Mapping Techniques, Chen Li, Liang Jin, and Sharad Mehrotra, World Wide Web Journal, Volume 9, Number 4, pages 557-584, December 2006. [PDF]
Achieving Communication Efficiency through Push-Pull Partitioning of Semantic Spaces to Disseminate Dynamic Information, Amitabha Bagchi, Amitabh Chaudhary, Michael T. Goodrich, Chen Li, and Michal Shmueli-Scheuer. IEEE Transactions on Knowledge and Data Engineering (TKDE), October 2006 (Vol. 18, No. 10). [PDF]
Answering Queries Using Materialized Views with Minimum Size. Rada Chirkova, Chen Li, and Jia Li. VLDB Journal (2006), Volume 15, Number 3, 191-210. [PDF]
Supporting Approximate Similarity Queries with Quality Guarantees in P2P Systems, Qi Zhong, Iosif Lazaridis, Mayur Deshpande, Chen Li, Sharad Mehrotra, Hal Stern, COMAD 2006, December 14-16, 2006, Delhi, India. [PDF]
Relaxing Join and Selection Queries. Nick Koudas, Chen Li, Anthony Tung, and Rares Vernica. VLDB 2006, Seoul, Korea, 2006. (13.2% accepted) [PDF], [PPT], [Source Code]
Selectivity Estimation for Fuzzy String Predicates in Large Data Sets, Liang Jin and Chen Li. VLDB 2005, Trondheim, Norway, August 30 – September 2, 2005. (16% accepted) [PDF], [PPT], [Source Code].
Indexing Mixed Types for Approximate Retrieval, Liang Jin, Nick Koudas, Chen Li, Anthony K.H. Tung. VLDB 2005, Trondheim, Norway, August 30 – September 2, 2005. (16% accepted) [PDF], [PPT], [Source Code].
Quality-Driven Approximate Methods for GIS Data Integration. Ramaswamy Hariharan, Michal Shmueli-Scheuer, Chen Li, and Sharad Mehrotra. ACM GIS 2005, November 4-5, 2005, Bremen, Germany. [PDF]
Answering Aggregation Queries on Hierarchical Web Sites Using Adaptive Sampling. Foto Afrati, Paraskevas Lekeas, and Chen Li. Technical Report, UCI ICS, August 2005. A short version appears in CIKM’2005, 31st October – 5th November, 2005 Bremen, Germany.
XGuard: A System for Publishing XML Documents without Information Leakage in the Presence of Data Inference. Xiaochun Yang, Chen Li, Ge Yu, and Lei Shi. Proc. of ICDE’2005, demo track, Tokyo, Japan, March 2005.
Secure XML Publishing without Information Leakage in the Presence of Data Inference. Xiaochun Yang and Chen Li. VLDB, Toronto, Canada, August 29 – September 3, 2004. [PDF], [PPT]. (16% accepted)
NNH: Improving Performance of Nearest-Neighbor Searches Using Histograms. Liang Jin, Nick Koudas, Chen Li. EDBT, Crete, Greece, March 2004. (14% accepted) [PDF], [Full version], [PPT]
On Containment of Conjunctive Queries with Arithmetic Comparisons. Foto Afrati, Chen Li, Prasenjit Mitra. EDBT, Crete, Greece, March 2004. (14% accepted) [PDF].
RACCOON: A Peer-Based System for Data Integration and Sharing. Chen Li, Jia Li, Qi Zhong. Proc. of ICDE’2004, demo track. [PDF]
Recent Progress on Selected Topics on Database Research — A Report from Nine Young Chinese Researchers Working in the United States. Zhiyuan Chen, Chen Li, Jian Pei, Yufei Tao, Haixun Wang, Wei Wang, Jiong Yang, Jun Yang, and Donghui Zhang. The Journal of Computer Science and Technology. Vol. 18, No. 5, Pages 538 – 552, September 2003. [PDF]
Computing Complete Answers to Queries in the Presence of Limited Access Patterns. Chen Li. The VLDB Journal (2003) 12: 211-227 [PS] [PDF]
Materializing Views with Minimal Size to Answer Queries. Rada Chirkova and Chen Li. ACM PODS, June 2003, San Diego, CA. (20% accepted). [PDF], [PPT]
Efficient Record Linkage in Large Data Sets, Liang Jin, Chen Li, and Sharad Mehrotra, in the 8th International Conference on Database Systems for Advanced Applications (DASFAA 2003) 26 – 28 March, 2003, Kyoto, Japan. (33% accepted) [PS], [PDF], [PPT], [Source Code]. Received DASFAA 2013 10-year Best Paper Award.
Schema-Guided Wrapper Maintenance for Web-Data Extraction. Xiaofeng Meng, Dongdong Hu, Chen Li. To appear in the Fifth International Workshop on Web Information and Data Management (WIDM 2003), New Orleans, Louisiana. [PDF] [PPT].
Using Constraints to Describe Source Contents in Data Integration Systems. Chen Li. IEEE Intelligent Systems 18(5): 49-53 (2003). [PDF]
Describing and Utilizing Constraints to Answer Queries in Data-Integration Systems. Chen Li. IJCAI 2003 workshop on Information Integration on the Web, August 2003, Acapulco, Mexico. [PDF], [PPT]
Clustering for Approximate Similarity Search in High-Dimensional Spaces. Chen Li, Edward Chang, Hector Garcia-Molina, and Gio Wiederhold. IEEE Transactions on Knowledge and Data Engineering, Volume 14, Number 4, pp. 792-808, July/August 2002. [PS] [PDF]
Executing SQL over Encrypted Data in the Database-Service-Provider Model. Hakan Hacigumus, Bala Iyer, Chen Li, and Sharad Mehrotra. In ACM SIGMOD, June 3-6, 2002, Madison, Wisconsin. (18% accepted). Received SIGMOD 2012 10-year Test-of-Time Award. [PDF]
Answering Queries Using Views with Arithmetic Comparisons. Foto Afrati, Chen Li, and Prasenjit Mitra. In ACM Symposium on Principles of Database Systems (PODS), June 3-6, 2002, Madison, Wisconsin. (22% accepted)
Answering Queries with Useful Bindings. Chen Li and Edward Chang. ACM Transactions on Database Systems (TODS), Volume 26, Issue 3 (September 2001). [PS] [PDF]
Generating Efficient Plans for Queries Using Views. Foto Afrati, Chen Li, and Jeff Ullman. In the Proc. of the 30th ACM SIGMOD Conference, Santa Barbara, CA, May, 2001. (15% accepted) [PS] [PDF] [PPT]
Minimizing View Sets without Losing Query-Answering Power. Chen Li, Mayank Bawa, and Jeff Ullman. In the 8th International Conference on Database Theory (ICDT), London, UK, January, 2001. [PS] [PDF], [PPT]. Full version: [PS] [PDF]. (35% accepted)
On Answering Queries in the Presence of Limited Access Patterns. Chen Li and Edward Chang. In the 8th International Conference on Database Theory (ICDT), London, UK, January, 2001. [PS] [PDF] [PPT]. (35% accepted)
Query Processing and Optimization in Information-Integration Systems. Chen Li. Ph.D. Thesis, Computer Science Department, Stanford University, August, 2001.
Query Planning with Limited Source Capabilities. Chen Li and Edward Chang. International Conference on Data Engineering (ICDE), pages 401-412, San Diego, CA, February, 2000. (14% accepted) [PS] [PDF] [PPT]. Full version: [PS] [PDF]
Towards Perception-Based Image Retrieval. Edward Chang, Beitao Li, and Chen Li. Proceedings of IEEE Workshop on Content-based Access of Image and Video Libraries, p. 401-412, South Carolina, June, 2000. [PS] [PDF]
Managing Parallel Disks for Continuous Media Data. Edward Chang, Chen Li, and Hector Garcia-Molina. A Book Chapter in Information Organization & Databases, p. 107-120, Kluwer Publisher, 2000. [PS] [PDF]
Answering Queries with Database Restrictions (Research Summary). Chen Li. Symposium on Abstraction, Reformulation and Approximation (SARA), pages 328 – 329, July, 2000, Horseshoe Bay (Lake LBJ), Texas. [PS] [PDF]
Computing Capabilities of Mediators. Ramana Yerneni, Chen Li, Hector Garcia-Molina, Jeffrey Ullman. SIGMOD’99, Philadelphia, PA, May 1999. (20% accepted) [PS] [PDF]. Full version: [PS] [PDF]
Optimizing Large Join Queries in Mediation Systems. Ramana Yerneni, Chen Li, Jeffrey Ullman, Hector Garcia-Molina. International Conference on Database Theory (ICDT), Jerusalem, Israel, January, 1999. (29% accepted) [PS] [PDF]. Full version: [PS] [PDF]
Searching Near-Replicas of Images via Clustering. Edward Chang, Chen Li, James Wang, Peter Mork, and Gio Wiederhold. Proc. of SPIE Symposium of Voice, Video, and Data Communications, Multimedia Storage and Archiving Systems VI, pages 281-292, Boston, MA, September, 1999. [PS] [PDF]
Capability Based Mediation in TSIMMIS. Chen Li, Ramana Yerneni, Vasilis Vassalos, Hector Garcia-Molina, Yannis Papakonstantinou, Jeffrey Ullman, Murty Valiveti. Proc. of ACM SIGMOD 1998, demo track, pages 564 – 566, Seattle, WA, June, 1998. [PS] [PDF]
RIME: A Replicated Image Detector for the World-Wide Web. Edward Chang, James Ze Wang, Chen Li, and Gio Wiederhold. Proceedings of SPIE Symposium of Voice, Video, and Data Communications, pages 58–67, Boston, MA, November 1998. [PS] [PDF]
2D BubbleUp: Managing Parallel Disks for Media Servers. Edward Chang, Hector Garcia-Molina, and Chen Li. The 5th International Conference of Foundations of Data Organization (FODO), pages 221-230, Kobe, Japan, 1998. [PS] [PDF]
Performance Analysis of the Communication Mechanism for POE Workstation Cluster. Weiqiang Zhuang, Chen Li, Meiming Shen. Microcomputer & Micro-system, Jan, 1995
HiComm — A New Technique for Improving Communication Performance in Workstation Cluster. Chen Li, Weiqiang Zhuang, Meiming Shen, Dingxing Wang, Weimin Zheng, Proc. of International Workshop on Advanced Parallel Processing Technologies (APPT), October, 1995, Beijing, China.