Themis Palpanas has been involved in the development of the JedAI toolkit, which has released its third version. JedAI is an open-source, modular toolkit for highly scalable, end-to-end Entity Resolution (ER) that is domain-agnostic (it requires no expert knowledge) and structure-agnostic (it works with structured, e.g., relational, semi-structured, e.g., RDF, and unstructured, e.g., free-text, entity descriptions). JedAI is described here, and implements most of the state-of-the-art Entity Resolution techniques, described here.

Themis Palpanas has published a new book on The Four Generations of Entity Resolution, discussing all traditional and modern techniques for this problem that lies at the core of data integration and data cleaning.

Our former PhD student, Michele Linardi, wins the BDA (Francophone Data Management Community) Best PhD Thesis Award!

Themis Palpanas has been elected to the VLDB Endowment Board of Trustees.

Contact Info

LIPADE - University of Paris
45 Rue Des Saints-Peres
Paris 75006, France

email: [first name] at "mi.parisdescartes.fr"

tel:  +33-1-7653-0365

in person: once you arrive at the building, walk through the main entrance and straight ahead, make a right to the large hall (with the staircases), take the elevator to the 8th floor, make a left as you exit the elevator, and again left at the end of the corridor; my office is the last one on the right: 812K

I am a Senior Member of the French University Institute (IUF), a distinction that recognizes research excellence across all scientific disciplines, an ACM Senior Member and Distinguished Speaker, Director of LIPADE (Computer Science department), and Director of the Data Intelligence Institute of Paris (diiP), an IdEx-funded (Initiative of Excellence) interdisciplinary institute that includes the University of Paris, Sciences Po university, the University Sorbonne Paris Nord, and the French Institute for Demographic Studies (INED).

I am organizing the University of Paris Seminar Series on Data Analytics.

My research is on problems related to data science, focusing on online and offline big data analytics and machine learning applications. It has been funded by the EU, IUF, CNRS, Facebook, IBM Research, HP Labs, Safran, Huawei, FMJH (supported by EDF and Thales), Telecom Italia, and PAT, and has led to an IBM Shared University Research Award, 3 Best Paper awards in international conferences, 9 US Patents (3 are part of commercial systems), and 2 French Patents.


diiP Institute

I am the Director of the Data Intelligence Institute of Paris (diiP), an IdEx-funded (Initiative of Excellence) interdisciplinary institute whose goal is to support and foster interdisciplinary collaborations around topics relevant to data science and data intelligence. The partner institutions are the University of Paris, Sciences Po university, the University Sorbonne Paris Nord, and the French Institute for Demographic Studies (INED).


Seminar Series on Data Analytics

I am organizing the University of Paris Seminar Series on Data Analytics, which regularly hosts prominent researchers in the general areas of data and information management, processing, and analysis.


Research Interests

The topics that keep me busy:
  • Data Series Indexing and Mining
  • Data Management
  • Data Analytics
  • Streaming Data Processing
  • Objectivity Analysis
  • Semantic-Web Engineering
My group has world-leading expertise on data series (a.k.a. time series) management, indexing, and analysis, which we are now integrating into the nestor project (data series management system), and further developing in the PLATON project (distributed processing).

We recently released our extensive experimental comparisons on exact and approximate data series similarity search (surprisingly, these techniques prove to be the methods of choice for general high-dimensional vectors as well, under many different conditions).

We have developed the current state-of-the-art data series indexes, iSAX2+ (bulk loading), DPiSAX (distributed), ADS+ (adaptive), ParIS+ (modern hardware), MESSI (in-memory), SING (GPU-enabled), Coconut (balanced data series index based on sortable summarizations), Coconut-LSM (data series index for streaming data and queries in ad-hoc windows in the past), ULISSE (supporting variable-length queries), and SEAnet (deep learning summarizations), the only data series query workload benchmark, as well as DSStat, a toolset for data series preprocessing and visualization. We have also developed the state-of-the-art techniques for progressive similarity search, and for subsequence anomaly detection: NormA, SAND, and Series2Graph.

We have applied our techniques to streaming and uncertain data series, and have worked with data from diverse domains, such as home networks, road tunnels, and manufacturing. In 2016, we organized the two editions of the International Interdisciplinary Workshop on Time Series Analysis (edition 1, edition 2), and in 2019, the Dagstuhl Seminar on Data Series Management, which attracted researchers and practitioners from computer science, astrophysics, neuroscience, music, engineering, and manufacturing. Our tutorials in the areas of data series and high-dimensional data similarity search were presented at ISCC 2020, IEEE BigData 2020, EDBT 2021, ICDE 2021, and VLDB 2021.



I am constantly learning from my students:

Short Bio

I am a Senior Member of the French University Institute (IUF), Professor 1CL in LIPADE, the computer science department of the University of Paris, and Director of the Data Intelligence Institute of Paris (diiP), LIPADE, and the Data Intensive and Knowledge Oriented Systems (diNo) group. I am a Member of the Board of Trustees of the VLDB Endowment, Associate Editor for VLDB 2019, Editor in Chief for the Big Data Research (BDR) journal, and Associate Editor for the Transactions on Knowledge and Data Engineering (TKDE) journal.

I have been General Chair for VLDB 2013 (the premier international conference on very large databases), and a founder and director of dbTrento (the Data and Information Management Group of the University of Trento).

I have worked at the University of Trento, the IBM T.J. Watson Research Center, and at the University of California at Riverside. I have also held visiting researcher positions at the National University of Singapore, Microsoft Research, and the IBM Almaden Research Center.

I received my BSc degree from the National Technical University of Athens, Greece, and then received the MSc and PhD degrees from the University of Toronto, Canada.

Trivium: my Erdos Number is 3.