
Paradiplomacy in German Subnational Parliaments

This script analyzes the references to international organizations in German state parliaments.

Usage Instructions

Setup

  1. Clone this repository
  2. Install the necessary dependencies, which are listed in the pyproject.toml file. First, create a virtual environment (e.g. python -m venv venv) and activate it (e.g. source venv/bin/activate). Then, from the repository root, install the requirements with the command pip install . (the trailing dot is part of the command). To install the dev dependencies as well, run pip install -e ".[dev]" instead.
  3. Run the main.py script (e.g. python main.py) from the root directory to start the analysis.

Reference Collection

When no references exist in the database, you have the option to:

  1. Load prepared references - Quick option using pre-processed data
  2. Collect on your own - Full processing pipeline (slower, resource-intensive)

If references already exist, you have the option to update them or skip this step.

Model Training & Automated Tagging

You have three options available:

  1. Train Model - Interactive manual classification of sample references
  2. Run Automated Tagging - Use existing training data to classify all references automatically
  3. Skip - Proceed without classification (limits available analyses)

Analysis

You can choose from two groups of analyses:

General Statistics (1-6): Generate descriptive analyses and visualizations for states, political affiliations, strategies, timelines, and streamgraphs.

Hypothesis Testing (11-13): Test research hypotheses about strategy distribution, affiliation-strategy relationships, and state variations.

All analyses generate both charts (PNG/PGF) and raw data (CSV) files.

Classification Categories

The project uses a structured taxonomy of argumentation strategies:

Deflection

  • Scapegoating: Highlight mistakes and failures of other member entities
  • Passing Responsibility: Present the IO as being responsible for decisions or policies

Validation

  • Authority Argument: Invoke the IO as an argument from authority
  • Self-Praise: Emphasize own achievements and successes in comparison to other member entities

Legitimation

  • Showcasing Cooperation: Illustrate the various forms of international cooperation and their contribution to formulating and implementing adequate policies
  • Highlighting: Highlight the value of the IO by clarifying the benefits and advantages of being a member of the IO
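
The taxonomy above can be written down as a small lookup structure. The following is a minimal sketch only: the names TAXONOMY, CATEGORY_OF and category_of are hypothetical and are not taken from the project's source code, which may represent the categories differently.

```python
# Hypothetical representation of the taxonomy listed in this README.
# The identifiers are illustrative, not the project's actual names.
TAXONOMY: dict[str, tuple[str, ...]] = {
    "Deflection": ("Scapegoating", "Passing Responsibility"),
    "Validation": ("Authority Argument", "Self-Praise"),
    "Legitimation": ("Showcasing Cooperation", "Highlighting"),
}

# Reverse index: strategy label -> parent category.
CATEGORY_OF: dict[str, str] = {
    strategy: category
    for category, strategies in TAXONOMY.items()
    for strategy in strategies
}


def category_of(strategy: str) -> str:
    """Return the top-level category for a strategy label."""
    try:
        return CATEGORY_OF[strategy]
    except KeyError:
        raise ValueError(f"Unknown strategy: {strategy!r}") from None
```

A reverse index like this makes it cheap to aggregate strategy-level tags into the three top-level categories when summarizing results.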

Infrequently Asked Questions

Can I manually manage the CPU and RAM usage?

Yes, you can. The reference collection requires a significant amount of hardware resources. By default, it is capped at the number of available CPU cores minus 2 and at 60 % of the available memory. These values can be overridden with the environment variables THREAD_LIMIT=foo for CPU usage and MEMORY_LIMIT=bar for memory usage (replace foo and bar with the desired values). The limits do not apply to the model training. If you decide to train the model on your own, make sure your system has enough spare capacity so that it is not negatively affected.
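
The default caps and the override behaviour described above could be resolved roughly as follows. This is a sketch under stated assumptions, not the project's implementation: the function names are hypothetical, and the unit of MEMORY_LIMIT is not documented here, so the sketch assumes a byte count.

```python
import os


def resolve_thread_limit(env=None, cpu_count=None) -> int:
    """Threads to use: THREAD_LIMIT if set, otherwise all cores minus two."""
    env = os.environ if env is None else env
    cores = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    override = env.get("THREAD_LIMIT")
    if override is not None:
        return max(1, int(override))
    # Never go below one worker, even on machines with one or two cores.
    return max(1, cores - 2)


def resolve_memory_limit(available_bytes: int, env=None) -> int:
    """Memory budget: MEMORY_LIMIT if set, otherwise 60 % of available memory.

    Assumption: MEMORY_LIMIT is a byte count; the project may use another unit.
    """
    env = os.environ if env is None else env
    override = env.get("MEMORY_LIMIT")
    if override is not None:
        return int(override)
    return available_bytes * 60 // 100
```

Passing the environment and the hardware figures as parameters keeps the logic testable without depending on the machine it runs on.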

Are translations cached?

Yes, translations are cached and stored in data/processed/translation_cache.json. If you want to update a specific translation, remove the respective entry from that file. If you want to restart the translation process from scratch, e.g. because the organizations were updated, delete the translation cache file.
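
Removing a single cached translation can also be scripted. The sketch below assumes that the cache file is a flat JSON object mapping a key (for example an organization name) to its translation; the actual layout of the file may differ, and drop_cached_translation is a hypothetical helper, not part of the project.

```python
import json
from pathlib import Path

# Path taken from this README; the flat {key: translation} layout is an assumption.
CACHE_PATH = Path("data/processed/translation_cache.json")


def drop_cached_translation(key: str, cache_path: Path = CACHE_PATH) -> bool:
    """Remove one entry from the cache file; return True if it existed."""
    if not cache_path.exists():
        return False
    cache = json.loads(cache_path.read_text(encoding="utf-8"))
    if key not in cache:
        return False
    del cache[key]
    # ensure_ascii=False keeps umlauts and other non-ASCII characters readable.
    cache_path.write_text(
        json.dumps(cache, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return True
```

The removed entry should then be translated again on the next run, since it is no longer found in the cache.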

How is the database structured?

References Table

Stores identified references to international organizations in parliamentary speeches.

| Column Name | Type | Default | Description |
|---|---|---|---|
| organization | VARCHAR | | Name of the referenced international organization |
| id | INTEGER | | Unique identifier for the reference |
| protocol | VARCHAR | | Protocol identifier |
| state | VARCHAR | | German state (Bundesland) |
| period | INTEGER | | Parliamentary period |
| nth | INTEGER | | Session number |
| date | DATE | | Date of the parliamentary session |
| sequence_number | INTEGER | | Order of the paragraph within the session |
| speaker_extracted | VARCHAR | | Extracted speaker name |
| speaker_id | VARCHAR | | StatePol speaker identifier |
| speaker_name | VARCHAR | | StatePol speaker name |
| affiliation | VARCHAR | | Political party or functional affiliation |
| content | VARCHAR | | Content of the speech segment containing the reference |

Classifications Table

Stores manual and automatic classifications of reference types and argumentation strategies.

| Column Name | Type | Default | Description |
|---|---|---|---|
| reference_id | INTEGER | | Unique identifier linking to references table |
| tag_manually | VARCHAR | | Manual classification tag |
| tag_automatically | VARCHAR | | Automatic classification tag |
| confidence | FLOAT | | Confidence score for automatic classification |
| last_updated | TIMESTAMP | CURRENT_TIMESTAMP | Last modification timestamp |
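
To illustrate how the references and classifications tables relate, the sketch below joins them on reference_id. It uses an in-memory SQLite database as a stand-in, since the project's database engine and exact table names are not stated here. The rows are made-up placeholders, only a subset of the documented columns is included, and preferring the manual tag over the automatic one is an assumed policy.

```python
import sqlite3

# In-memory SQLite stand-in; engine and table names are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE "references" (
        id INTEGER PRIMARY KEY,
        organization VARCHAR,
        state VARCHAR,
        affiliation VARCHAR
    );
    CREATE TABLE classifications (
        reference_id INTEGER,
        tag_manually VARCHAR,
        tag_automatically VARCHAR,
        confidence FLOAT,
        last_updated TIMESTAMP DEFAULT CURRENT_TIMESTAMP
    );
    """
)

# Placeholder rows for illustration only, not real data.
conn.executemany(
    'INSERT INTO "references" (id, organization, state, affiliation) '
    "VALUES (?, ?, ?, ?)",
    [
        (1, "Example Org A", "Example State", "Party X"),
        (2, "Example Org B", "Example State", "Party Y"),
    ],
)
conn.executemany(
    "INSERT INTO classifications "
    "(reference_id, tag_manually, tag_automatically, confidence) "
    "VALUES (?, ?, ?, ?)",
    [
        (1, None, "Scapegoating", 0.8),
        (2, "Self-Praise", "Authority Argument", 0.6),
    ],
)

# Prefer the manual tag and fall back to the automatic one (assumed policy).
rows = conn.execute(
    """
    SELECT r.id, r.organization,
           COALESCE(c.tag_manually, c.tag_automatically) AS tag
    FROM "references" AS r
    JOIN classifications AS c ON c.reference_id = r.id
    ORDER BY r.id
    """
).fetchall()
```

The table name is quoted because REFERENCES is a reserved word in SQL.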

Paragraphs Table

Contains parliamentary speech paragraphs from StateParl.

| Column Name | Type | Default | Description |
|---|---|---|---|
| id | INTEGER | | Unique identifier for the paragraph |
| protocol | VARCHAR | | Protocol identifier |
| state | VARCHAR | | German state (Bundesland) |
| period | INTEGER | | Parliamentary period |
| nth | INTEGER | | Session number |
| date | DATE | | Date of the parliamentary session |
| sequence_number | INTEGER | | Order of the paragraph within the session |
| speaker_extracted | VARCHAR | | Extracted speaker name |
| speaker_id | VARCHAR | | StatePol speaker identifier |
| speaker_name | VARCHAR | | StatePol speaker name |
| affiliation | VARCHAR | | Political party or functional affiliation |
| content | VARCHAR | | Content of the paragraph |