Themis Palpanas
Former PhD student Paul Boniol wins the Paul Caseau PhD Prize, the BDA Best PhD Thesis Award (the 2nd time for our group, after Michele Linardi in 2019), the INFORSID PhD Prize, and the Lambda-Mu Research and Industry Prize, for his work on scalable data series anomaly detection, summarized in this 3-minute video.
Involved in the development of the JedAI toolkit, which has released its third version. JedAI is an open-source, modular toolkit for highly scalable end-to-end Entity Resolution (ER) that is domain-agnostic (no expert knowledge) and structure-agnostic (works with structured, e.g., relational, semi-structured, e.g., RDF, and unstructured, e.g., free-text, entity descriptions). JedAI is described here, and implements most of the state-of-the-art Entity Resolution techniques, described here.
Published a new book on The Four Generations of Entity Resolution, discussing all traditional and modern techniques for this problem that lies at the core of data integration and data cleaning.
LIPADE - University of Paris
45 Rue Des Saints-Peres
Paris 75006, France
email: [first name] at "mi.parisdescartes.fr"
in person: once you arrive at the building, walk through the main entrance and straight ahead, make a right to the large hall (with the staircases), take the elevator to the 8th floor, make a left as you exit the elevator, and again left at the end of the corridor; my office is the last one on the right: 812K
I am Distinguished Professor of Computer Science at Universite Paris Cite, Senior Member of the French University Institute (IUF), a distinction that recognizes research excellence across all scientific disciplines, ACM Senior Member and Distinguished Speaker, Director of LIPADE (Computer Science department), and Director of the Data Intelligence Institute of Paris (diiP), an IdEx-funded (Initiative of Excellence) interdisciplinary institute that includes the Universite Paris Cite, Sciences Po university, the University Sorbonne Paris Nord, and the French Institute for Demographic Studies (INED).
I am organizing the University of Paris Seminar Series on Data Analytics.
My research is on problems related to data science, focusing on online and offline big data analytics and machine learning applications. It has been funded by the EU, IUF, CNRS, Facebook, IBM Research, HP Labs, Safran, Huawei, FMJH (supported by EDF and Thales), Telecom Italia, and PAT, and has led to an IBM Shared University Research Award, 3 Best Paper awards in international conferences, 9 US Patents (3 are part of commercial systems), and 2 French Patents.
diiP Institute
I am the Director of the Data Intelligence Institute of Paris (diiP), an IdEx-funded (Initiative of Excellence) interdisciplinary institute whose goal is to support and foster interdisciplinary collaborations around topics relevant to data science and data intelligence. The partner institutions are the Universite Paris Cite, Sciences Po university, the University Sorbonne Paris Nord, and the French Institute for Demographic Studies (INED).
Seminar Series on Data Analytics
I am organizing the Universite Paris Cite Seminar Series on Data Analytics, which regularly hosts prominent researchers in the general areas of data and information management, processing, and analysis.
Research Interests
The topics that keep me busy:
We recently released our extensive experimental comparisons on exact and approximate data series similarity search (surprisingly, these techniques prove to be the methods of choice for general high-dimensional vectors, as well, under many different conditions).
We have developed the current state-of-the-art data series indexes: iSAX2+ (bulk loading), DPiSAX (distributed), ADS+ (adaptive), ParIS+ (modern hardware), MESSI (in-memory), SING (GPU-enabled), Coconut (a balanced data series index based on sortable summarizations), Coconut-LSM (a data series index for streaming data and queries in ad-hoc windows in the past), ULISSE (supporting variable-length queries), and SEAnet (deep learning summarizations). We have also developed the only data series query workload benchmark, as well as DSStat, a toolset for data series preprocessing and visualization. In addition, we have developed the state-of-the-art techniques for progressive similarity search, and for subsequence anomaly detection: NormA, SAND, and Series2Graph.
We have applied our techniques to streaming and uncertain data series, and have worked with data from diverse domains, such as home networks, road tunnels, and manufacturing. In 2016, we organized the two editions of the International Interdisciplinary Workshop on Time Series Analysis (edition 1, edition 2), and in 2019, the Dagstuhl Seminar on Data Series Management, which attracted researchers and practitioners from computer science, astrophysics, neuroscience, music, engineering, and manufacturing. Our tutorials in the areas of data series and high-dimensional data similarity search have appeared at ISCC 2020, IEEE BigData 2020, EDBT 2021, ICDE 2021, and VLDB 2021.
Students
I am constantly learning from my students:
I am Senior Member of the French University Institute (IUF), Distinguished Professor (CL Ex) in LIPADE, the computer science department of the Universite Paris Cite, and Director of the Data Intelligence Institute of Paris (diiP), LIPADE, and the Data Intensive and Knowledge Oriented Systems (diNo) group. I am Member of the Board of Trustees of the VLDB Endowment, Associate Editor for VLDB 2019, Editor in Chief for the Big Data Research (BDR) journal, and Associate Editor for the Transactions on Knowledge and Data Engineering (TKDE) journal.
I have been General Chair for VLDB 2013 (the premier international conference on very large databases), and a founder and director of dbTrento (the Data and Information Management Group of the University of Trento).
I have worked at the University of Trento, the IBM T.J. Watson Research Center, and at the University of California at Riverside. I have also held visiting researcher positions at the National University of Singapore, Microsoft Research, and the IBM Almaden Research Center.
Trivium: my Erdos Number is 3.