Data Scientist
Denver, CO, United States
We're Proof, a fast-growing startup in the legal tech industry. Our best-in-class legal services platform is used by thousands of law firms throughout the US, and we've recently raised our Series B from top-tier investors. We're looking for a talented Data Scientist to join our world-class team and drive the application of AI and ML to make our operations more efficient. This is an exciting opportunity to be the first Data Scientist at a rapidly scaling company and work closely with our CTO and VP of Product.
At Proof, we believe in the power of experimentation and data-driven decision making. While we don't see AI as a silver bullet, we are convinced that our unique position in the legal industry provides us with a real use case where this technology can drive immense business value and become our competitive advantage. Our extensive dataset of legal documents and deep understanding of the legal domain put us in a prime position to harness AI and ML to streamline our operations and provide unparalleled service to our clients.
As our first Data Scientist, you will have the opportunity to shape our AI strategy from the ground up. You will work on complex problems, influence the tools and techniques that we use each day, and see the direct impact of your work on our business and customers. If you're excited about applying your skills to real-world challenges and being at the forefront of AI innovation and application in the legal industry, this is the perfect role for you.
What you'll do:
Develop and optimize NLP, NER, and classification models to automatically extract information from legal documents, reducing manual transcription
Collaborate with engineering to build data pipelines and to deploy, monitor, and maintain ML models in our production environments
Conduct statistical analysis on our business data to generate actionable insights and help guide pricing experiments
Fine-tune and adapt existing models for our specific use cases, balancing performance and cost-effectiveness
Maintain and optimize our data warehouse and ETL pipelines to ensure data quality and availability (we use Airbyte and BigQuery)
Utilize our extensive document dataset to experiment with and improve our AI capabilities
Communicate findings and recommendations to both technical and non-technical stakeholders
Contribute to the overall data strategy and help shape the future of AI at Proof
Partner with engineering to build data preprocessing tooling (e.g. PDF to text, OCR, handwritten text detection, page orientation detection)
What we'll expect you to know on day one:
3-5 years of experience in data science, machine learning, or a related field
Proven track record of implementing ML solutions that drive business value
Strong proficiency in Python and experience with relevant NLP & deep learning libraries (e.g. TensorFlow, PyTorch, spaCy)
Solid understanding of NLP/NER techniques and experience with tasks like information extraction and document/text classification
Ability to analyze complex datasets, draw meaningful conclusions, and propose data-driven solutions
Knowledge of data engineering principles and experience with tools like SQL and Spark
Excellent communication skills and ability to collaborate cross-functionally
Self-starter mindset and excitement about applying AI to real-world business problems
Great to have, but not required:
Advanced degree in Computer Science, Statistics, or a related quantitative field
Familiarity with optimization techniques to balance the needs of multi-sided marketplaces
Expertise using OCR to extract data from large PDF documents
Compensation and Benefits:
100% remote, work from anywhere in the US
Flexible paid time off and holidays
Equipment provided
Health care, vision, dental, disability insurance, and 401(k) options
Salary range $130,000-$160,000 based on experience and location
E-Verify
This company participates in E-Verify. For more information, view the Participation and Right to Work posters.