Research organisations
INED works with several other French research organisations, in particular INSERM, IRD and the CNRS. Like INED, these institutes are EPSTs (Établissements publics à caractère scientifique et technologique).
The National Institute of Statistics and Economic Studies collects, analyses and disseminates information on the French economy and society.
The Socface project brings together archivists, demographers, economists, historians, and computer scientists to develop technologies for the large-scale processing of very large series of historical documents. Based on automated handwriting recognition, the project aims to analyze all nominal census lists from 1836 to 1936 (20 censuses). This will produce a database of all individuals who lived in France between 1836 and ...
The Socface project aims to develop automatic handwriting recognition technologies to transcribe all the census lists from 1836 to 1936 (i.e. 20 censuses). The information gathered will be used to produce a database of all individuals who lived in France during this period and to follow them throughout their lives. Then, we will take advantage of this database to analyze ...
Our objectives are threefold. First, Socface will develop new methods to extract individual-level data from a very large set of archival document images, taking advantage of the inherent structure of the original document, and with auto-evaluation of the quality of the process. Second, the project will be the first in France to link records from the same individuals over time ...
Produced every five years starting in 1836, the listes nominatives are a summary of the census, providing information on each individual with some of his or her characteristics, such as name, year of birth, and occupation (see the presentation of the source on FranceArchives). They are organized spatially, each individual being located in a household, itself located in a house, ...
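The nesting described above (individual, within a household, within a house) can be pictured as a small data model. The following is only an illustrative sketch: the class names, field names, and sample values are assumptions made for this example, not the schema used by Socface, and only the levels and characteristics named in the text are modelled.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Individual:
    # Characteristics named in the text: name, year of birth, occupation.
    surname: str
    given_name: str
    birth_year: Optional[int] = None
    occupation: Optional[str] = None


@dataclass
class Household:
    members: list[Individual] = field(default_factory=list)


@dataclass
class House:
    households: list[Household] = field(default_factory=list)


def count_individuals(house: House) -> int:
    """Count every individual recorded in every household of a house."""
    return sum(len(household.members) for household in house.households)


# Invented sample data, for illustration only.
example_house = House(
    households=[
        Household(
            members=[
                Individual("Martin", "Jean", 1802, "cultivateur"),
                Individual("Martin", "Marie", 1806),
            ]
        ),
        Household(members=[Individual("Durand", "Pierre", 1811, "tisserand")]),
    ]
)

print(count_individuals(example_house))
```

Keeping the hierarchy explicit, rather than flattening each line into an independent row, preserves the spatial information that the lists encode through their ordering.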
The huge number of listes (about 15 million images from 1836 to 1936, corresponding to 700 million individual records) and their spatial dispersion (they are kept in nearly one hundred archive repositories) have limited their use until now. The Socface project intends to overcome this limitation by using the most recent advances in machine learning technologies. Taking advantage of the regularity ...
The first step of the project is to collect millions of images from about a hundred archive repositories, understand the document structure, and recognize the handwritten text. A portal will allow archive repositories and their publishers to upload images and associated metadata. Then, automated methods will be used to extract the information they contain: line detection, text recognition, consistency tests, etc. ...
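As an illustration of what a consistency test on a transcribed line might look like, here is a minimal sketch. The rules, the field names (`surname`, `birth_year`), and the `max_age` limit are assumptions chosen for the example; the text does not specify which tests the project actually applies.

```python
def check_record(record: dict, census_year: int, max_age: int = 110) -> list[str]:
    """Return a list of problems found in one transcribed record.

    An empty list means the record passed these (hypothetical) checks.
    """
    problems = []

    # A line without a surname is probably a recognition failure.
    if not (record.get("surname") or "").strip():
        problems.append("missing surname")

    # A birth year must not be later than the census that recorded it,
    # and the implied age should stay within a plausible range.
    birth_year = record.get("birth_year")
    if birth_year is not None:
        if birth_year > census_year:
            problems.append("birth year after census year")
        elif census_year - birth_year > max_age:
            problems.append("implausible age")

    return problems


# Invented sample records, for illustration only.
print(check_record({"surname": "Martin", "birth_year": 1802}, census_year=1836))
print(check_record({"surname": "", "birth_year": 1840}, census_year=1836))
```

Records that fail such checks could be flagged for review rather than discarded, since the error may lie in the recognition step rather than in the original document.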
Text recognition on all images of the listes nominatives will help build a set of individual data structured by municipality and by census year: millions of lines, each containing the given name and surname of an individual (and some of his or her characteristics). Of course, many of these lines correspond to the same person, observed in different censuses. Our objective is ...
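To make the idea of recognizing the same person across censuses concrete, the sketch below compares two records using normalized name similarity and a tolerance on the year of birth. This is a deliberately simple stand-in, not the linkage method used by the project; the thresholds, field names, and sample records are all assumptions.

```python
import unicodedata
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lower-case a name and strip accents so spelling variants compare equal."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return without_accents.lower().strip()


def name_similarity(a: str, b: str) -> float:
    """Similarity between 0.0 and 1.0 based on matching character blocks."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def is_candidate_match(rec_a: dict, rec_b: dict,
                       name_threshold: float = 0.85,
                       year_tolerance: int = 2) -> bool:
    """Decide whether two census lines may describe the same person."""
    # Reported birth years often drift slightly between censuses.
    if abs(rec_a["birth_year"] - rec_b["birth_year"]) > year_tolerance:
        return False
    full_a = f'{rec_a["surname"]} {rec_a["given_name"]}'
    full_b = f'{rec_b["surname"]} {rec_b["given_name"]}'
    return name_similarity(full_a, full_b) >= name_threshold


# Invented sample records from two different censuses.
in_1836 = {"surname": "Lefèvre", "given_name": "Joseph", "birth_year": 1810}
in_1841 = {"surname": "Lefevre", "given_name": "Joseph", "birth_year": 1811}
other = {"surname": "Durand", "given_name": "Pierre", "birth_year": 1810}

print(is_candidate_match(in_1836, in_1841))
print(is_candidate_match(in_1836, other))
```

A pairwise comparison like this does not scale to hundreds of millions of lines on its own; in practice candidates would first be narrowed down, for example by municipality, before any detailed comparison.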
The transcriptions of the listes nominatives obtained through the project will be made available to the general public, allowing anyone to browse freely through millions of records. They will be accessible both on FranceArchives and on the websites of the various Archives Départementales.