Paradiplomacy in German Subnational Parliaments
This script analyzes the references to international organizations in German state parliaments.
Usage Instructions
Setup
- Clone this repository
- Install the necessary dependencies, which are listed in the `pyproject.toml` file. First, create a virtual environment (e. g. `python -m venv venv`) and activate it (e. g. `source venv/bin/activate`), then install the requirements by running `pip install .`. To install the dev dependencies in addition to the necessary packages, run `pip install -e ".[dev]"`.
- Run the `main.py` script (e. g. `python main.py`) from the root directory to run the analysis.
Reference Collection
When no references exist in the database, you have the option to:
- Load prepared references - Quick option using pre-processed data
- Collect on your own - Full processing pipeline (slower, resource-intensive)
If references already exist, you have the option to update them or skip this step.
Model Training & Automated Tagging
You have three options available:
- Train Model - Interactive manual classification of sample references
- Run Automated Tagging - Use existing training data to classify all references automatically
- Skip - Proceed without classification (limits available analyses)
Analysis
You can choose among the following groups of analyses:
General Statistics (1-6): Generate descriptive analyses and visualizations for states, political affiliations, strategies, timelines, and streamgraphs.
Hypothesis Testing (11-13): Test research hypotheses about strategy distribution, affiliation-strategy relationships, and state variations.
All analyses generate both charts (PNG/PGF) and raw data (CSV) files.
Classification Categories
The project uses a structured taxonomy of argumentation strategies:
Deflection
- Scapegoating: Highlight mistakes and failures of other member entities
- Passing Responsibility: Present the IO as being responsible for decisions or policies
Validation
- Authority Argument: Invoke the IO as an argument from authority
- Self-Praise: Emphasize one's own achievements and successes in comparison to other member entities
Legitimation
- Showcasing Cooperation: Illustrate the various forms of international cooperation and their contribution to formulating and implementing adequate policies
- Highlighting: Highlight the value of the IO by clarifying the benefits and advantages of being a member of the IO
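The taxonomy above can be written down as a small lookup structure. The following is a minimal sketch, not the project's actual code: the names `TAXONOMY`, `STRATEGY_TO_CATEGORY`, and `category_of` are hypothetical, and only the category and strategy labels are taken from the list above.

```python
# Hypothetical representation of the taxonomy. The labels come from this README;
# the variable and function names are illustrative assumptions.
TAXONOMY = {
    "Deflection": ["Scapegoating", "Passing Responsibility"],
    "Validation": ["Authority Argument", "Self-Praise"],
    "Legitimation": ["Showcasing Cooperation", "Highlighting"],
}

# Reverse index: strategy label -> parent category.
STRATEGY_TO_CATEGORY = {
    strategy: category
    for category, strategies in TAXONOMY.items()
    for strategy in strategies
}


def category_of(strategy: str) -> str:
    """Return the parent category of a strategy label (raises KeyError if unknown)."""
    return STRATEGY_TO_CATEGORY[strategy]
```

A reverse index like this makes it straightforward to aggregate tagged references at the category level as well as at the strategy level.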
Infrequently Asked Questions
Can I manually manage the CPU and RAM usage?
Yes, you can. The reference collection requires a significant amount of hardware resources. By default, it is capped at the number of CPU cores minus 2 and at 60 % of the available memory. These values can be overridden with the environment variables THREAD_LIMIT=foo for CPU usage and MEMORY_LIMIT=bar for memory usage. The limits do not apply to the model training, so if you decide to train the model on your own, make sure your system won't be negatively affected.
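As an illustration of how such limits might be resolved, here is a minimal sketch under stated assumptions. The function `resolve_limits` is hypothetical and not taken from the project. The README does not state the unit of MEMORY_LIMIT, so the sketch assumes plain integers with memory given in bytes, and it assumes a floor of one worker on machines with few cores.

```python
import os


def resolve_limits(cpu_count, available_bytes, env=None):
    """Return (thread_limit, memory_limit_bytes) following the defaults described above."""
    env = os.environ if env is None else env
    # Default cap: all CPU cores minus 2, but never fewer than 1 worker
    # (the floor of 1 is an assumption of this sketch).
    default_threads = max(1, cpu_count - 2)
    # Default cap: 60 % of the available memory (integer arithmetic avoids rounding issues).
    default_memory = available_bytes * 60 // 100
    # Environment variables override the defaults when set.
    # Assumption: both values are plain integers, MEMORY_LIMIT in bytes.
    thread_limit = int(env.get("THREAD_LIMIT", default_threads))
    memory_limit = int(env.get("MEMORY_LIMIT", default_memory))
    return thread_limit, memory_limit
```

Passing the environment as a parameter keeps the function easy to test; calling `resolve_limits(8, 16_000, env={})` uses the defaults only, while `env={"THREAD_LIMIT": "4"}` overrides the CPU cap.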
Are translations cached?
Yes, translations are cached and stored in data/processed/translation_cache.json. If you want to update a specific translation, remove the respective element. If you want to start the translation process from scratch, e. g. because the organizations were updated, delete the translation cache file.
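Removing a single cached translation can also be scripted. The sketch below is illustrative only: the helper `drop_cached_translation` is hypothetical, and it assumes the cache file is a flat JSON object that maps a source term to its translation, which this README does not confirm.

```python
import json
from pathlib import Path


def drop_cached_translation(cache_path, key):
    """Remove one entry from the translation cache; return True if it existed."""
    path = Path(cache_path)
    cache = json.loads(path.read_text(encoding="utf-8"))
    # Assumption: the cache is a flat JSON object keyed by the source term.
    if key not in cache:
        return False
    del cache[key]
    # Write the remaining entries back, keeping non-ASCII characters readable.
    path.write_text(json.dumps(cache, ensure_ascii=False, indent=2), encoding="utf-8")
    return True
```

With this assumption, `drop_cached_translation("data/processed/translation_cache.json", "some term")` would force that one term to be translated again on the next run, while deleting the file resets the whole cache.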
How is the database structured?
References Table: Stores identified references to international organizations in parliamentary speeches.
| Column Name | Type | Default | Description |
|---|---|---|---|
| organization | VARCHAR | | Name of the referenced international organization |
| id | INTEGER | | Unique identifier for the reference |
| protocol | VARCHAR | | Protocol identifier |
| state | VARCHAR | | German state (Bundesland) |
| period | INTEGER | | Parliamentary period |
| nth | INTEGER | | Session number |
| date | DATE | | Date of the parliamentary session |
| sequence_number | INTEGER | | Order of the paragraph within the session |
| speaker_extracted | VARCHAR | | Extracted speaker name |
| speaker_id | VARCHAR | | StatePol speaker identifier |
| speaker_name | VARCHAR | | StatePol speaker name |
| affiliation | VARCHAR | | Political party or functional affiliation |
| content | VARCHAR | | Content of the speech segment containing the reference |
Classifications Table: Stores manual and automatic classifications of reference types and argumentation strategies.
| Column Name | Type | Default | Description |
|---|---|---|---|
| reference_id | INTEGER | | Unique identifier linking to references table |
| tag_manually | VARCHAR | | Manual classification tag |
| tag_automatically | VARCHAR | | Automatic classification tag |
| confidence | FLOAT | | Confidence score for automatic classification |
| last_updated | TIMESTAMP | CURRENT_TIMESTAMP | Last modification timestamp |
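
The link between the references and classifications tables can be illustrated with a small join. This is a sketch under several assumptions: SQLite from the Python standard library stands in for the project's actual database engine, the table names `references` and `classifications` are guesses based on the headings, only a subset of the columns is recreated, the inserted rows are made up, and preferring the manual tag over the automatic one is an illustrative choice rather than documented behavior.

```python
import sqlite3

# Assumption: SQLite stands in for the real engine; table names are guesses.
# "references" is a reserved word in SQLite, so it has to be quoted.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE "references" (
        id INTEGER PRIMARY KEY,
        organization VARCHAR,
        state VARCHAR,
        affiliation VARCHAR,
        content VARCHAR
    );
    CREATE TABLE classifications (
        reference_id INTEGER,
        tag_manually VARCHAR,
        tag_automatically VARCHAR,
        confidence FLOAT,
        last_updated TIMESTAMP DEFAULT CURRENT_TIMESTAMP
    );
    """
)
# Made-up rows, purely to make the query below runnable.
conn.execute(
    'INSERT INTO "references" VALUES (1, ?, ?, ?, ?)',
    ("Example Organization", "Example State", "Example Party", "Example speech text"),
)
conn.execute(
    "INSERT INTO classifications (reference_id, tag_automatically, confidence) "
    "VALUES (1, ?, ?)",
    ("Scapegoating", 0.9),
)


def tags_by_state(connection):
    """Count references per state and tag, preferring the manual tag if present."""
    query = """
        SELECT r.state,
               COALESCE(c.tag_manually, c.tag_automatically) AS tag,
               COUNT(*) AS n
        FROM "references" AS r
        JOIN classifications AS c ON c.reference_id = r.id
        GROUP BY r.state, tag
        ORDER BY r.state, tag
    """
    return connection.execute(query).fetchall()


rows = tags_by_state(conn)
```

Here `reference_id` in the classifications table is matched against `id` in the references table, which is the relationship the column descriptions suggest.
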
Paragraphs Table: Contains parliamentary speech paragraphs from StateParl.
| Column Name | Type | Default | Description |
|---|---|---|---|
| id | INTEGER | | Unique identifier for the paragraph |
| protocol | VARCHAR | | Protocol identifier |
| state | VARCHAR | | German state (Bundesland) |
| period | INTEGER | | Parliamentary period |
| nth | INTEGER | | Session number |
| date | DATE | | Date of the parliamentary session |
| sequence_number | INTEGER | | Order of the paragraph within the session |
| speaker_extracted | VARCHAR | | Extracted speaker name |
| speaker_id | VARCHAR | | StatePol speaker identifier |
| speaker_name | VARCHAR | | StatePol speaker name |
| affiliation | VARCHAR | | Political party or functional affiliation |
| content | VARCHAR | | Content of the paragraph |