
Data Scientist Required for IRS 990 Analysis and NLP in Public Policy Research

Remote, USA | Full-time | Posted 2025-11-24
We are looking for a computationally strong data expert for a fixed-duration, 3-week research engagement on an empirical project in the nonprofit–government policy space. The project involves large-scale administrative data, open government datasets, and advanced NLP, with the goal of producing a results-ready analysis suitable for submission to a top-tier journal in nonprofit studies or public administration.

Because the project contains a novel methodological contribution, a nondisclosure agreement (NDA) will be required before full project details, data architecture, and analytic framework are shared. However, the broad technical skillset needed is listed below so that interested candidates may assess fit.

Required Expertise (General Outline)

Data Engineering & Administrative Data
- Parsing and transforming large, semi-structured or unstructured public datasets (e.g., XML/JSON)
- Building reproducible Python ETL pipelines
- Managing data at scale (millions of text records)

Natural Language Processing
Experience with at least one of the following:
- Topic modeling (LDA and/or embedding-based methods)
- Clustering techniques such as UMAP + HDBSCAN
- Text classification or thematic modeling workflows
- Working with transformer-based embedding models

Statistical Modeling
Familiarity with:
- Panel data regression models
- Time-series alignment or co-movement analysis
- Similarity metrics (e.g., cosine, correlation)
- Robustness testing and model validation

Research Communication
- Ability to document methods explicitly and clearly
- Experience drafting or co-drafting Methods and Results sections for academic publication
- Comfort preparing reproducible files (notebooks, GitHub structure, workflow notes)

Engagement Details
- Duration: 3 weeks (full-time or near full-time)
- Nature of work: Analytic + computational + methodological documentation
- Output: Clean datasets, documented code, analytic results, and draft text suitable for journal submission
- Confidentiality: NDA required before project description, data schema, or methodology is disclosed

Ideal Candidate
- Quantitative / computational postdocs
- Researchers in public policy, computational social science, economics, political science, sociology, or information science
- Applied data scientists with experience in text analytics or public-sector data
- Anyone excited to work on a well-scoped, well-funded research project with real publication potential
