Socure

Data Scientist II - Big Data R&D, Identity Graph & KYC

Reposted 3 Days Ago
Remote or Hybrid
Hiring Remotely in US
140K-170K Annually
Mid level
The role involves developing graph algorithms, data pipelines for identity verification, and supporting data analysis for compliance products.
The summary above was generated by AI
Why Socure?

Socure is building the identity trust infrastructure for the digital economy — verifying 100% of good identities in real time and stopping fraud before it starts. The mission is big, the problems are complex, and the impact is felt by businesses, governments, and millions of people every day.

We hire people who want that level of responsibility. People who move fast, think critically, act like owners, and care deeply about solving customer problems with precision. If you want predictability or narrow scope, this won’t be your place. If you want to help build the future of identity with a team that holds a high bar for itself — keep reading.

About the Role

The Big Data R&D team is responsible for building the core identity graph and entity-resolution capabilities that power Socure’s KYC and compliance products. In this role, you will help develop graph-based algorithms and data pipelines on massive PII datasets, support modelers with high-quality features, and evaluate new data sources that feed our identity and fraud products. You will work closely with senior data scientists and engineers while developing your skills in large-scale ML, distributed systems, and graph analytics.

What You'll Do
  • Contribute to the design and implementation of machine learning, data mining, statistical, and graph-based algorithms to analyze very large datasets for identity verification and anomaly detection.

  • Analyze large datasets to help develop and refine entity-resolution and identity-matching algorithms that drive Socure’s KYC and compliance solutions.

  • Build and maintain components of data-processing pipelines (ETL, feature generation, normalization) using tools such as Spark/PySpark and AWS (e.g., EMR, S3).

  • Support senior data scientists with feature engineering, data exploration, error analysis, and A/B test setup for new models and signals.

  • Help evaluate new third‑party and internal data sources: profile data quality, design offline experiments, and summarize impact on coverage and model performance.

  • Implement and maintain SQL and Python/R code for data extraction, transformation, and validation; contribute to code reviews and basic testing.

  • Provide analytical support to compliance and regulatory product teams, including ad hoc investigations, simple dashboards, and data deep dives.

  • Communicate findings in a clear, structured way to peers and cross‑functional partners (Product, Engineering, Client Analysis), focusing on key insights and trade‑offs.

  • Work effectively in a fast‑paced, cross‑functional environment; demonstrate ownership of well-scoped tasks and follow through to completion.

What You Bring
  • Master’s degree with 2+ years of experience, or Ph.D. with 1+ years of experience in a data science or analytics role, or equivalent practical experience.

  • Proficiency in at least one general-purpose programming language used in data science (Python or Scala).

  • Solid experience writing and optimizing SQL for large datasets; comfort working in data lake / warehouse environments.

  • Hands‑on experience with Spark or PySpark and common ML libraries (e.g., scikit‑learn, XGBoost); TensorFlow/PyTorch experience is a plus.

  • Familiarity with UNIX environments and the AWS ecosystem (e.g., EMR, S3); Databricks experience is a plus.

  • Working knowledge of supervised/unsupervised ML and basic statistics (similarity measures, clustering, evaluation metrics).

  • Exposure to graph techniques or graph databases (Neo4j, AWS Neptune, GraphFrames) is a strong plus.

  • Bonus: experience with Elasticsearch or DynamoDB, or with workflow tools such as Airflow for automating data pipelines.

  • Ability to break down loosely defined problems, ask good clarifying questions, and iterate quickly with feedback.

Please note that sponsorship is not available at this time and that you must be located within 45 miles of a talent hub to be considered.

Socure is an equal opportunity employer that values diversity in all its forms within our company. We do not discriminate based on race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.
If you need an accommodation during any stage of the application or hiring process—including interview or onboarding support—please reach out to your Socure recruiting partner directly.
