Talks and presentations

Workshop GAST 2020

January 18, 2020

Workshop organization, Conférence Extraction et Gestion des Connaissances (EGC) 2020, Brussels, Belgium

The sixth workshop on Management and Analysis of Spatial and Temporal Data (Gestion et Analyse des données Spatiales et Temporelles, GAST) will be held at EGC 2020. Building on the GAST Working Group, the workshop aims to bring together researchers from academia and industry who are interested in issues related to taking temporal or spatial information, whether quantitative or qualitative, into account in their data management and analysis processes (methods and applications for the extraction, management, representation, analysis and visualization of information).

Adapting and integrating existing open source projects

January 09, 2020

Talk, University of Nevada, Reno, NV, USA

I led the session on ‘Adapting and integrating existing open source projects’.

Workshop: Ethical Visualization in the Age of Big Data. Contemporary Cultural Implications of Pre-Twentieth-Century French Texts. A workshop to seek interdisciplinary expert perspectives on ethically and visually representing the historical place of misrepresented peoples and locales.

13th Workshop on Geographic Information Retrieval (GIR)

December 08, 2019

Workshop organization, 13th Workshop on Geographic Information Retrieval (GIR), Lyon, France

The 13th Workshop on Geographic Information Retrieval will be held in Lyon, France, on 28-29 November 2019. This workshop will address all aspects of Geographic Information Retrieval, including but not limited to methods to retrieve and analyse geo-spatial textual content, identify geographic scope, and rank documents or other resources by relevance, from both unstructured and partially structured collections.
Organized by Ross Purves, Chris Jones, Ludovic Moncla and Mauro Gaio

GeoDISCO: Encyclopedic Geographical Discourse in France from the Enlightenment to Wikipedia

December 08, 2019

Talk, 13th Workshop on Geographic Information Retrieval (GIR), Lyon, France

Authors: Denis Vigier, Thierry Joliveau, Ludovic Moncla, Katherine McDonough, and Alice Brenon
Abstract: The GeoDISCO project aims at studying the major changes in encyclopedic geographical discourse in France between 1751 (when the first volume of the Encyclopédie ou dictionnaire raisonné des sciences, des arts et des métiers, by Diderot and D’Alembert, was published) and today (Wikipedia-France, 2018). Using linguistic and GIS methods to investigate patterns in geographical content will help us understand why authors deployed language in such ways that use place as a scaffold for ideas and practices. The spatial history of French encyclopedias is a foundation for asking broader questions about the relationship between early modern geographical information and digital geographical resources.

Workshop: 13th Workshop on Geographic Information Retrieval (GIR)
Organized by Ross Purves, Chris Jones, Ludovic Moncla and Mauro Gaio

Spatial Entity Matching with GeoAlign (demo paper)

November 07, 2019

Talk, 27th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, Chicago, IL, USA

Authors: Nelly Barret, Fabien Duchateau, Franck Favetta, and Ludovic Moncla
Abstract: Points of interest (POIs) are central in many applications such as tourism, itinerary search, and crisis management. Cartographic providers usually represent these POIs with a spatial entity. However, the description of these entities may significantly vary from one provider to another (e.g., missing properties, outdated information, conflicting values). Spatial entity matching (or record linkage) aims at detecting correspondences between entities referring to the same POI. Most existing approaches have a fixed function for combining similarity measures, thus limiting customization. Besides, evaluating the matching quality is a difficult task since a ground truth dataset cannot be built for all entities and providers. In this paper, we describe GeoAlign, an application that allows fine-grained tuning for spatial entity matching. A merging step is also provided using different strategies. Finally, we propose to estimate the quality of correspondences based on the differences between combination functions and to visualize this estimation in GeoAlign.
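The idea of a tunable combination function can be illustrated with a minimal sketch. This is not GeoAlign's implementation: the record fields (`name`, `lat`, `lon`), the two similarity measures, the 200 m distance cut-off, and the function names are all assumptions chosen for illustration. The `disagreement` helper is one possible reading of "differences between combination functions", namely the spread of scores across several weightings.

```python
from difflib import SequenceMatcher
from math import asin, cos, radians, sin, sqrt


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    earth_radius_m = 6371000.0
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * earth_radius_m * asin(sqrt(a))


def similarities(p, q, max_dist_m=200.0):
    """Per-measure similarities in [0, 1] for two entity records (hypothetical schema)."""
    name = SequenceMatcher(None, p["name"].lower(), q["name"].lower()).ratio()
    dist = haversine_m(p["lat"], p["lon"], q["lat"], q["lon"])
    # Linear decay: 1.0 at the same location, 0.0 at max_dist_m or beyond.
    return {"name": name, "distance": max(0.0, 1.0 - dist / max_dist_m)}


def combined_score(p, q, weights):
    """Weighted average of the similarity measures; the weights are user-tunable."""
    scores = similarities(p, q)
    return sum(weights[k] * scores[k] for k in weights) / sum(weights.values())


def disagreement(p, q, weightings):
    """Spread between several combination functions (0.0 means they all agree)."""
    values = [combined_score(p, q, w) for w in weightings]
    return max(values) - min(values)
```

A pair whose score changes little across weightings is one the combination functions agree on; a large spread flags a correspondence worth reviewing by hand.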

3rd ACM SIGSPATIAL International Workshop on Geospatial Humanities

November 05, 2019

Workshop organization, 3rd ACM SIGSPATIAL International Workshop on Geospatial Humanities, Chicago, IL, USA

Following the success of previous editions in 2017 and 2018, this workshop is concerned with the use of geographic information systems and other spatial technologies in humanities research, placing a strong emphasis on new methodologies that leverage recent technical developments (e.g., standard tools from geographic information systems, as well as more advanced methods such as text-based geographical analysis or spatial simulation, can all benefit from innovative approaches leveraging machine learning, parallel and/or distributed computation, semantic technologies, etc.). The workshop aims to bring together researchers and practitioners from different sub-fields of computer science and the geographical information sciences, interested in the application of spatial methods and technology to the humanities, to discuss progress in the field. Participants will explore and demonstrate the contributions to knowledge that modern GIS technologies can enable within and beyond the digital humanities.
Organized by Bruno Martins, Ludovic Moncla and Patricia Murrieta-Flores

Toponym Disambiguation in Historical Documents Using Network Analysis of Qualitative Relationships

November 05, 2019

Talk, 3rd ACM SIGSPATIAL International Workshop on Geospatial Humanities, Chicago, IL, USA

Authors: Ludovic Moncla, Katherine McDonough, Denis Vigier, Thierry Joliveau, and Alice Brenon
Abstract: In this paper we use network analysis to identify qualitative “neighbors” for toponyms in an eighteenth-century French encyclopedia; the method could apply to any entry-based text with annotated toponyms. This method draws on relations in a corpus of articles, which improves disambiguation at a later stage with an external resource. We suggest the network as an alternative to geospatial representation, a useful proxy when no historical gazetteer exists for the source material’s period. Our first experiments have shown that this approach goes beyond a simple text analysis and is able to find relations between toponyms that are not co-occurring in the same documents. Network relations are also usefully compared with disambiguated toponyms to evaluate geographical coverage, and the ways that geographical discourse is expressed, in historical texts.
Organized by Bruno Martins, Ludovic Moncla and Patricia Murrieta-Flores
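
How a network can relate toponyms that never co-occur in the same document can be sketched as follows. This is an illustrative assumption, not the paper's actual network construction: here an edge links two toponyms annotated in the same entry, and "indirect neighbors" are the toponyms reachable in exactly two hops. The function names and the input shape (a mapping from entry id to its annotated toponyms) are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_graph(entries):
    """Undirected graph linking toponyms annotated in the same entry.

    entries: {entry_id: [toponym, ...]} (hypothetical input shape).
    """
    graph = defaultdict(set)
    for toponyms in entries.values():
        for a, b in combinations(sorted(set(toponyms)), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def indirect_neighbors(graph, toponym):
    """Toponyms two hops away: related through a shared neighbor, never co-occurring."""
    direct = graph.get(toponym, set())
    second = set()
    for neighbor in direct:
        second |= graph[neighbor]
    return second - direct - {toponym}
```

With entries `{"e1": ["Lyon", "Rhône"], "e2": ["Rhône", "Avignon"]}`, "Lyon" and "Avignon" never share an entry, yet the graph relates them through "Rhône".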

Towards the geoparsing and geocoding of environmental narratives

April 10, 2019

Talk, Environmental Narratives Workshop, Stels, Switzerland

Abstract: In this talk I briefly describe some of our previous and current work on geographic information retrieval. Then, I introduce some first results that show how this work can be linked to English narratives, and particularly how it can be used for geoparsing and geocoding environmental narratives.
Organized by Ross Purves, Olga Koblet, and Ben Adams.

Cartographier les odonymes de Paris citées dans les romans du XIXème siècle

November 06, 2018

Talk, Atelier Humanités Numériques Spatialisées, SAGEO 2018, Montpellier, France

Authors: Ludovic Moncla, Mauro Gaio, Thierry Joliveau
Abstract: In this article, we address two gaps in NLP research: working with historical French and working with complex textual structures moving beyond running text or lists of place names. Our methodology is based on the evaluation of the results of two spatial named entity recognition tools in the context of early modern document analysis structured as dictionaries.
Organized by Carmen Brando, Francesca Frontini, and Mathieu Roche.

Expérimentation de méthodes d’extraction d’informations géographiques pour les documents historiques.

November 06, 2018

Talk, Atelier Humanités Numériques Spatialisées, SAGEO 2018, Montpellier, France

Authors: Katherine McDonough, Ludovic Moncla, and Matje van de Camp
Abstract: In this article, we address two gaps in NLP research: working with historical French and working with complex textual structures moving beyond running text or lists of place names. Our methodology is based on the evaluation of the results of two spatial named entity recognition tools in the context of early modern document analysis structured as dictionaries.
Organized by Carmen Brando, Francesca Frontini, and Mathieu Roche.

Automated geoparsing of Paris street names in 19th century novels.

November 07, 2017

Talk, 1st ACM SIGSPATIAL International Workshop on Geospatial Humanities, Redondo Beach, CA, USA

Authors: Ludovic Moncla, Mauro Gaio, Thierry Joliveau, and Yves-François Le Lay
Abstract: Our project involves building a platform able to retrieve, map and analyze the occurrences of place names in fictional novels published between 1800 and 1914 and whose action occurs wholly or partly in Paris. We describe a proof of concept using queries made via the TXM textual analysis platform for the extraction of street names. Then, we propose a fully automatic process using the named entity recognition (NER) components of the PERDIDO platform. This paper describes some encouraging initial results obtained by combining NLP approaches (NER methods) with textometric tools for the automated geoparsing of street names.
Organized by Bruno Martins and Patricia Murrieta-Flores

Extended Named Entity Recognition Using Finite-State Transducers: An Application To Place Names.

November 07, 2017

Talk, 9th International Conference on Advanced Geographic Information Systems, Applications, and Services, Nice, France

Authors: Mauro Gaio, Ludovic Moncla
Abstract: Textual geographical information is frequently organized around spatial named entities. Such entities have intrinsic ambiguities, and Named Entity Recognition and Classification methods should be improved in order to handle this problem. This article describes a knowledge-based method implementing a full process with the aim of annotating the spatial information in textual documents more precisely. This gain in accuracy guarantees a better analysis of the spatial information and a better disambiguation of places. The backbone of our proposal is a construction grammar and a cascade of finite-state transducers. The evaluation shows that the introduced concept of hierarchical overlapping is very helpful for detecting a local context associated with Named Entities.

Pluridisciplinary aspects of NLP and GIS: an application to itinerary reconstruction

September 20, 2017

Poster, RDA 10th Plenary Meeting, Montréal, Canada

Authors: Ludovic Moncla
Abstract: One of the main challenges of this work is to connect text with geographic space and to provide a map-based representation of itineraries described in textual documents. The main objectives are:

  • data mining for Geographic Information Retrieval (GIR),
  • toponym resolution and disambiguation,
  • extraction and retrieval of displacements from textual documents.

Geocoding for texts with fine-grain toponyms

November 05, 2014

Talk, 22nd ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, Dallas, TX, USA

Authors: Ludovic Moncla, Walter Renteria-Agualimpia, Javier Nogueras-Iso, and Mauro Gaio
Abstract: Geoparsing and geocoding are two essential middleware services to facilitate final user applications such as location-aware searching or different types of location-based services. The objective of this work is to propose a method for establishing a processing chain to support the geoparsing and geocoding of text documents describing events strongly linked with space and with a frequent use of fine-grain toponyms. The geoparsing part is a Natural Language Processing approach which combines the use of part of speech and syntactico-semantic combined patterns (cascade of transducers). However, the real novelty of this work lies in the geocoding method. The geocoding algorithm is unsupervised and takes advantage of clustering techniques to provide a solution for disambiguating the toponyms found in gazetteers, and at the same time estimating the spatial footprint of those other fine-grain toponyms not found in gazetteers. The feasibility of the proposal has been tested with a corpus of hiking descriptions in French, Spanish and Italian.
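
The intuition behind unsupervised, proximity-based disambiguation can be sketched in a few lines. This is a deliberately simplified density heuristic standing in for the clustering step, not the algorithm from the paper: each ambiguous toponym keeps the gazetteer candidate that lies close to candidates of the most other toponyms in the same text. The function names, the input shape, the planar coordinates (assumed to be in a projected system, e.g. kilometres), and the 10-unit radius are all assumptions. The bounding box of the retained locations then serves as a rough footprint for toponyms missing from the gazetteer.

```python
from math import hypot


def support(name, loc, candidates, radius):
    """Count the other toponyms having at least one candidate within radius of loc."""
    return sum(
        any(hypot(loc[0] - o[0], loc[1] - o[1]) <= radius for o in others)
        for other_name, others in candidates.items()
        if other_name != name
    )


def disambiguate(candidates, radius=10.0):
    """Pick, for each toponym, the candidate location with the highest support.

    candidates: {toponym: [(x, y), ...]} from a gazetteer lookup (hypothetical shape).
    """
    return {
        name: max(locs, key=lambda loc: support(name, loc, candidates, radius))
        for name, locs in candidates.items()
    }


def footprint(chosen):
    """Bounding box (min_x, min_y, max_x, max_y) of the retained locations."""
    xs = [x for x, _ in chosen.values()]
    ys = [y for _, y in chosen.values()]
    return (min(xs), min(ys), max(xs), max(ys))
```

Because a hiking description mentions places that are near one another, isolated homonyms far from the rest of the candidates receive no support and are discarded.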

Automatic itinerary reconstruction from texts

September 25, 2014

Talk, 8th International Conference on Geographic Information Science (GIScience 2014), Vienna, Austria

Authors: Ludovic Moncla, Mauro Gaio, and Sébastien Mustière
Abstract: This paper proposes an approach for the reconstruction of itineraries extracted from narrative texts. This approach is divided into two main tasks. The first extracts geographical information with natural language processing. Its outputs are annotations of so-called expanded entities and expressions of displacement or perception from hiking descriptions. In order to reconstruct a plausible footprint of an itinerary described in the text, the second task uses the outputs of the first task to compute a minimum spanning tree.
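
The second task can be illustrated with a minimal sketch of a minimum spanning tree over located place names, here using Prim's algorithm. The abstract does not say which algorithm or edge weights the paper uses, so the plain Euclidean distance on planar coordinates, the input shape, and the function name are assumptions.

```python
import heapq
from math import hypot


def minimum_spanning_tree(points):
    """Prim's algorithm on a complete graph of located place names.

    points: {name: (x, y)} in planar coordinates (hypothetical input shape).
    Returns a list of (from_name, to_name, distance) edges.
    """
    names = list(points)
    if not names:
        return []

    def dist(a, b):
        (x1, y1), (x2, y2) = points[a], points[b]
        return hypot(x2 - x1, y2 - y1)

    start = names[0]
    visited = {start}
    heap = [(dist(start, other), start, other) for other in names[1:]]
    heapq.heapify(heap)
    edges = []
    while heap and len(visited) < len(names):
        d, u, v = heapq.heappop(heap)
        if v in visited:
            continue  # stale entry: v was already reached by a shorter edge
        visited.add(v)
        edges.append((u, v, d))
        for w in names:
            if w not in visited:
                heapq.heappush(heap, (dist(v, w), v, w))
    return edges
```

The tree connects every extracted place with the smallest total length, which gives a plausible skeleton of the route; ordering the stops along it would still require the displacement expressions extracted in the first task.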

Topographic subtyping of place named entities: a linguistic approach

May 15, 2013

Talk, 16th AGILE conference on Geographic Information Science, Leuven, Belgium

Authors: Van Tien Nguyen, Mauro Gaio, and Ludovic Moncla
Abstract: The aim of this work is to find sub-types for Place Named Entities, from the analysis of relations between Place Names and a nominal group within a specific phrasal context. The proposed method combines the use of specific intra-sentential lexico-syntactic relations and external resources like gazetteers, thesauri, or ontologies. It relies on expanded spatial named entities recognition transcribed into a symbolic representation expressed in terms of semantic features. This symbolic representation will then be associated with a geo-coded representation, depending on the available resources. Our method is completely implemented and has been tested on a corpus of travelogues.