Package python_gedcom_2

A Python module for parsing, analyzing, and manipulating GEDCOM files.

Installation

The module can be installed via pipenv or simply pip.

Run pip3 install python-gedcom to install the package, or pip3 install python-gedcom --upgrade to upgrade to the newest version published on PyPI.

Tip

Using pipenv simplifies the installation and maintenance of dependencies.

Pre-releases

To use the latest pre-release of the python-gedcom package, append the --pre option to the pip3 command: pip3 install python-gedcom --pre

Important classes and modules

  • python_gedcom_2.parser.Parser: The actual GEDCOM parser.
  • python_gedcom_2.tags: GEDCOM tags like python_gedcom_2.tags.GEDCOM_TAG_INDIVIDUAL (INDI) or python_gedcom_2.tags.GEDCOM_TAG_NAME (NAME).
  • python_gedcom_2.element: Contains all relevant elements generated by a python_gedcom_2.parser.Parser.

Example usage

Once the package is installed, you can import python_gedcom_2 and use it like so:

from python_gedcom_2.element.individual import IndividualElement
from python_gedcom_2.parser import Parser

# Path to your ".ged" file
file_path = ''

# Initialize the parser
gedcom_parser = Parser()

# Parse your file
gedcom_parser.parse_file(file_path)

root_child_elements = gedcom_parser.get_root_child_elements()

# Iterate through all root child elements
for element in root_child_elements:

    # Is the "element" an actual "IndividualElement"? (Allows usage of extra functions such as "surname_match" and "get_name".)
    if isinstance(element, IndividualElement):

        # Get all individuals whose surname matches "Doe"
        if element.surname_match('Doe'):

            # Unpack the name tuple
            (first, last) = element.get_name()

            # Print the first and last name of the found individual
            print(first + " " + last)
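
The loop above can also be wrapped in a small reusable function. The following is a minimal sketch, not part of the package: collect_full_names and its individual_type parameter are hypothetical names introduced here, and the function relies only on the surname_match and get_name methods shown in the example above. The element class is passed in as an argument so the sketch can be tried with stand-in objects as well as with real parsed elements.

```python
def collect_full_names(elements, surname, individual_type):
    """Return "first last" strings for every individual whose surname matches.

    Hypothetical helper: `individual_type` is the class to test against
    (IndividualElement when working with real parser output).
    """
    names = []
    for element in elements:
        # Skip anything that is not an individual (families, sources, ...)
        if not isinstance(element, individual_type):
            continue
        if element.surname_match(surname):
            first, last = element.get_name()
            names.append(f"{first} {last}".strip())
    return names
```

With the objects from the example above, it could be called as collect_full_names(root_child_elements, 'Doe', IndividualElement).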

Tip

Please have a look at the test files found in the tests/ directory in the source code on GitHub.

Strict parsing

Large sites like Ancestry and MyHeritage (among others) don't always produce perfectly formatted GEDCOM files. If you encounter parsing errors, consider disabling strict parsing, which is enabled by default:

from python_gedcom_2.parser import Parser

file_path = '' # Path to your `.ged` file

gedcom_parser = Parser()
gedcom_parser.parse_file(file_path, False) # Disable strict parsing

Disabling strict parsing will allow the parser to gracefully handle the following quirks:

  • Multi-line fields that don't use CONC or CONT
  • A last line that does not end in a CRLF (\r\n)
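
One possible way to combine both modes is to try strict parsing first and fall back to non-strict parsing only when it fails. The following is a minimal sketch under stated assumptions: parse_with_fallback and parser_factory are hypothetical names, not part of the package, and because this document does not name the exception raised in strict mode, the sketch catches Exception broadly. Only the Parser() constructor and the parse_file(file_path, False) call shown above are assumed from the library.

```python
def parse_with_fallback(parser_factory, file_path):
    """Parse strictly first; retry with strict parsing disabled on failure.

    `parser_factory` is any zero-argument callable returning a parser
    object, for example the Parser class itself.
    Returns a (parser, used_strict) tuple.
    """
    parser = parser_factory()
    try:
        # Strict parsing is enabled by default
        parser.parse_file(file_path)
        return parser, True
    except Exception:
        # Assumption: the strict-mode error type is unknown here, so the
        # catch is deliberately broad. Narrow it once the type is known.
        parser = parser_factory()  # fresh parser for the retry
        parser.parse_file(file_path, False)  # disable strict parsing
        return parser, False
```

It could be used as gedcom_parser, used_strict = parse_with_fallback(Parser, file_path), where used_strict reports whether the file passed strict parsing. An error that is unrelated to formatting, such as a missing file, is raised again by the second attempt.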

Sub-modules

python_gedcom_2.element

Module containing all relevant elements generated by a python_gedcom_2.parser.Parser. An element represents a line within GEDCOM data …

python_gedcom_2.element_creator
python_gedcom_2.helpers

Helper methods.

python_gedcom_2.parser

Module containing the actual python_gedcom_2.parser.Parser, which generates an element from each line; these elements can in turn be manipulated.

python_gedcom_2.tags

GEDCOM tags.