Summary
We determine the specific needs and challenges of marginalized communities that AI can address, then evaluate the technical feasibility and potential impact of candidate AI applications. The narrower the use case, the easier it will be for your team to evaluate.
We recommend defining the following as clearly and succinctly as possible before proceeding:
- [ ] Understand the problem: you and your team can clearly articulate the problem or opportunity the solution addresses
- [ ] Specify Goals: determine the outcomes/outputs (e.g. summarization, text generation, image creation)
- [ ] Identify Stakeholders: Identify whose input and feedback you will include at each step, including technical and domain experts and the end users
Evaluate Technical Feasibility
- [ ] Technical Challenges: Identify and address limitations or risks.
- [ ] Cost Analysis: Estimate costs for development, deployment, and maintenance.
- [ ] Comparison with Alternatives: Benchmark against existing tools or processes.
Evaluate Data Requirements
- [ ] Data Sources: Identify the sources and types of data (APIs, spreadsheets, PDFs, images, etc.); a short loading sketch follows this list
- [ ] Data Quality and Quantity: Assess the quality, volume, and relevance of available data
- [ ] Data Risk: Evaluate the data for risks:
  - [ ] Bias: Do the datasets contain bias?
  - [ ] Privacy: Do the datasets contain sensitive information and/or any information that could be used to identify a person? (A first-pass PII scan sketch follows this list.)
- [ ] Preprocessing Needs: Plan for cleaning, labeling, and augmenting data if necessary
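As a starting point for the Data Sources item above, here is a minimal sketch of pulling records from two common source types. The file name and API URL are placeholders, and the `pandas` and `requests` packages are assumed to be installed.

```python
# Minimal sketch: pulling records from two common source types.
# "community_survey_export.csv" and the API URL are placeholders.
import pandas as pd
import requests

# Spreadsheet-style data: a CSV export is often the lowest-friction format.
records = pd.read_csv("community_survey_export.csv")
print(records.shape, list(records.columns))

# API data: many civic and nonprofit data portals expose JSON over HTTP.
response = requests.get("https://example.org/api/v1/reports", timeout=30)
response.raise_for_status()
reports = response.json()
print(f"Fetched {len(reports)} records from the API")
```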
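For the Privacy item, a regex-based first pass can flag obvious identifiers before any dataset is shared with a model. The patterns below are illustrative, catch only common US-style formats, and complement rather than replace a human privacy review.

```python
import re

# First-pass patterns for obvious identifiers; they catch common US-style
# formats only and are no substitute for a human privacy review.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "us_phone": re.compile(r"\b(?:\+?1[\s.-]?)?\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def scan_for_pii(text: str) -> dict[str, list[str]]:
    """Return the matches for each PII pattern found in `text`."""
    hits = {name: pattern.findall(text) for name, pattern in PII_PATTERNS.items()}
    return {name: matches for name, matches in hits.items() if matches}

sample = "Contact Jane at jane.doe@example.org or 555-867-5309."
print(scan_for_pii(sample))
# {'email': ['jane.doe@example.org'], 'us_phone': ['555-867-5309']}
```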
Assess Model Requirements
- [ ] Model Selection: Research and evaluate existing generative models (e.g. Gemma, Llama, etc.), preferring small, open-source models where possible; a smoke-test sketch follows this list
- [ ] Customization: Determine if pre-trained models suffice or if fine-tuning is necessary.
- [ ] Performance Metrics: Define success metrics (e.g., accuracy, coherence, realism).
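For Model Selection, a few lines with the Hugging Face `transformers` pipeline are usually enough to smoke-test a small open model. The model id below is one plausible candidate, not a recommendation, and gated models such as Gemma require accepting the license and logging in to the Hub first.

```python
# Smoke test with a small open model via the Hugging Face pipeline API.
# Assumes `transformers`, `torch`, and `accelerate` are installed.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="google/gemma-2-2b-it",  # assumed model id; swap in your candidate
    device_map="auto",             # uses a GPU if one is available
)

prompt = "Summarize in one sentence: community health clinics often struggle to..."
result = generator(prompt, max_new_tokens=60, do_sample=False)
print(result[0]["generated_text"])
```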
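For Performance Metrics, it helps to pin at least one metric down to code early. This sketch computes unigram recall against a reference summary, a rough ROUGE-1-style score; it is one assumed interpretation of "accuracy" for a summarization use case and should be paired with human review for coherence and realism.

```python
def unigram_recall(generated: str, reference: str) -> float:
    """Fraction of reference words that appear in the generated text
    (a rough, ROUGE-1-style recall; case-insensitive, whitespace tokens)."""
    ref_tokens = reference.lower().split()
    gen_tokens = set(generated.lower().split())
    if not ref_tokens:
        return 0.0
    return sum(tok in gen_tokens for tok in ref_tokens) / len(ref_tokens)

print(unigram_recall("clinics lack stable funding", "clinics lack funding"))  # 1.0
```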
Determine Infrastructure and Scalability
- [ ] Compute Requirements: Assess the hardware needed (e.g., GPUs, TPUs, cloud services).
  - Cost can become a limiting factor quickly, which may restrict LLM size and whether the model is available 24/7 or on-demand.
  - Many LLM hosting services charge by the token, which can make implementing an LLM much more cost effective than self-hosting (see the cost sketch below). However, there are typically trade-offs:
    - No ability to pre-train or fine-tune the model
    - No full control over what the hosting company does with your data. PII should not be sent to a model regardless of where it is hosted, but our clients may have strict guidelines on how our models are hosted and by what organizations.
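To make the per-token trade-off concrete, here is a back-of-the-envelope estimator. Every number in the example call is an illustrative assumption, not a quoted price; check your provider's current rate card before relying on any estimate.

```python
# Back-of-the-envelope estimate of per-token hosting costs.
def monthly_token_cost(
    requests_per_day: int,
    input_tokens: int,        # average prompt length per request
    output_tokens: int,       # average completion length per request
    usd_per_1k_input: float,
    usd_per_1k_output: float,
) -> float:
    per_request = (input_tokens / 1000) * usd_per_1k_input \
                + (output_tokens / 1000) * usd_per_1k_output
    return per_request * requests_per_day * 30  # ~one month

# Example: 500 requests/day, ~1,000 tokens in / 300 out, assumed rates.
print(f"${monthly_token_cost(500, 1000, 300, 0.50, 1.50):,.2f}/month")  # $14,250.00/month
```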