Data Engineer (Founding Team)
Company: Fabrion
Location: Bodega Bay
Posted on: February 16, 2026
Job Description:
Data/ETL Engineer (Founding Team)
Location: San Francisco Bay Area
Type: Full-Time
Compensation: Competitive salary and early-stage equity

Backed by 8VC, we're building a world-class team to tackle one of the industry’s most critical infrastructure problems.

About the Role
We’re
building a multi-tenant, AI-native platform where enterprise data
becomes actionable through semantic enrichment, intelligent agents,
and governed interoperability. At the heart of this architecture
lies our Data Fabric — an intelligent, governed layer that turns
fragmented and siloed data into a connected ontology ready for
model training, vector search, and insight-to-action workflows.
We're looking for engineers who enjoy hard data problems at scale:
messy unstructured data, schema drift, multi-source joins, security
models, and AI-ready semantic enrichment. You’ll build the backend
systems, data pipelines, connector frameworks, and graph-based
knowledge models that fuel agentic applications. If you've worked
on streaming unstructured pipelines, built connectors into ugly
legacy systems, or mapped knowledge graphs that scale — this role
will feel like home.

Responsibilities
- Build highly reliable, scalable data ingestion and transformation pipelines across structured, semi-structured, and unstructured data sources
- Develop and maintain a connector framework for ingesting from enterprise systems (ERPs, PLMs, CRMs, legacy data stores, email, Excel, docs, etc.)
- Design and maintain the data fabric layer, including a knowledge graph (Neo4j or Puppygraph) enriched with ontologies, metadata, and relationships
- Normalize and vectorize data for downstream AI/LLM workflows, enabling retrieval-augmented generation (RAG), summarization, and alerting
- Create and manage data contracts, access layers, lineage, and governance mechanisms
- Build and expose secure APIs for downstream services, agents, and users to query enriched semantic data
- Collaborate with ML/LLM teams to feed high-quality enterprise data into model training and tuning pipelines

What We’re Looking For

Core Experience:
- 5 years building large-scale data infrastructure in production environments
- Deep experience with ingestion frameworks (Kafka, Airbyte, Meltano, Fivetran) and data pipeline orchestration (Airflow, Dagster, Prefect)
- Comfortable processing unstructured data formats: PDFs, Excel, emails, logs, CSVs, web APIs
- Experience working with columnar stores, object storage, and lakehouse formats (Iceberg, Delta, Parquet)
- Strong background in knowledge graphs or semantic modeling (e.g. Neo4j, RDF, Gremlin, Puppygraph)
- Familiarity with GraphQL, RESTful APIs, and designing developer-friendly data access layers
- Experience implementing data governance: RBAC, ABAC, data contracts, lineage, data quality checks

Mindset & Culture Fit:
- You’re a system thinker: you want to model the real world, not just process it
- Comfortable navigating ambiguous data models and building from scratch
- Passionate about enabling AI systems with real-world, messy enterprise data
- Pragmatic about scalability, observability, and schema evolution
- Value autonomy, high trust, and meaningful ownership over infrastructure

Bonus Skills
- Prior work with vector DBs (e.g. Weaviate, Qdrant, Pinecone) and embedding pipelines
- Experience building or contributing to enterprise connector ecosystems
- Knowledge of ontology versioning, graph diffing, or semantic schema alignment
- Familiarity with data fabric patterns (e.g. Palantir Ontology, Linked Data, W3C standards)
- Familiar with fine-tuning LLMs or enabling RAG pipelines using enterprise knowledge
- Experience enforcing data access policy with tools like OPA, Keycloak, Snowflake row-level security

Why This Role Matters
Agents are only as smart as the data they operate on.
This role builds the foundation — the semantic, governed, connected
substrate — that makes autonomous decision-making and agent action
possible. From factory ERP records to geopolitical news alerts, the
data fabric unifies it all. If you're excited to tame complexity,
unify chaos, and power intelligent systems with trusted data — we’d
love to hear from you.
Keywords: Fabrion, Sunnyvale, Data Engineer (Founding Team), IT / Software / Systems, Bodega Bay, California