Context
I recently received a timely email from Digital Ocean about how you can build your RAG Pipeline in 5 Steps using the Gradient Platform. The email was quite straightforward and it broke down all the steps quite clearly.
I got an idea right away, what if I could build out a RAG Pipeline on some resources I have been reading and collecting around AI Governance in the context of Civil Society Organizations ? I wanted to see if this could make it easier for me to potentially find or even directly reference information I already have around this topic. I figured, why not implement this to evaluate how great this could work as an experiment. I also had some Digital Ocean credits left over (about $176 or so) after getting some during Deep Dive Week at work last month.
Prerequisites
- A valid Digital Ocean Account
- Downloaded Resources that you wish to use as part of your knowledge base (PDFs, Docs, CSVs etc).
- Digital Ocean credits (To be able to pay for Digital Ocean Spaces and the indexing tasks when creating the Knowledge Base)
Setting up Digital Ocean Spaces to be used as our Data Sources
First, we want to make sure that we have everything we need before we create a knowledge base on the gradient platform. We’d love to use Digital Ocean Spaces as our Data Source too. (This step is optional as you can be able to upload your resources directly to the knowledge base. If you don’t need this, simply move forward to step 2)
Choosing Region of the Bucket
We start off by selecting the region of the Spaces Bucket
Since we can potentially use this Space for much more than just as a data store for this RAG project, we enable the CDN feature just in case we may need it.
Naming our Spaces Bucket
Next we name our Spaces Bucket
Understanding Cost implications of using Spaces on Digital Ocean
Total Monthly cost is about $5/month for this. This base rate of a Spaces subscription gives us the ability to create multiple Spaces buckets. It also includes 250 GiB of data storage (cumulative across all of your buckets). Additional storage beyond this allotment is $0.02 per GiB per month.
This should genuinely be more than enough for this since we won’t be using this for production use-cases, just experiments. We go ahead and create the Spaces Bucket.
Bucket created
We now are in a position where we can upload the data sources into our Bucket. Let’s do that at this point. For me, my workflow will be super simple, I will create a folder locally that has all the information I need then upload it using boto3.
Uploading data to the Spaces Bucket
The folder I want to upload is called AI-Compliance on the directory /Users/brian/documents/Books in my Mac.
I go ahead and generate access keys for the bucket rag-experiments by simply clicking on the settings option then finding the Access Keys option then we Create Access Key.
We will go with Limited access
Once you have clicked on create, we get our secret key and Access Key ID. (You can only view the secret key once btw)
Now that we have this we create our files
cd /Users/brian/Documents/Bookstouch upload_script.pypip3 install boto3python3 upload_script.py
Our upload_script.py looks like this
import boto3from boto3 import sessionfrom botocore.client import Configfrom boto3.s3.transfer import S3Transferimport os
#Use the API Keys you generated at Digital OceanACCESS_ID = 'xxxxxxxxxxxxxxxx'SECRET_KEY = 'xxxxxxxxxxxxxxxx'
# Initiate sessionsession = session.Session()client = session.client('s3', region_name='ams3', #enter your own region_name endpoint_url='https://ams3.digitaloceanspaces.com', #enter your own endpoint url aws_access_key_id=ACCESS_ID, aws_secret_access_key=SECRET_KEY)
transfer = S3Transfer(client)
# Upload entire folderfolder_path = '/Users/brian/Documents/Books/AI-Compliance'for root, dirs, files in os.walk(folder_path): for file in files: file_path = os.path.join(root, file) # Keep folder structure relative_path = os.path.relpath(file_path, '/Users/brian/Documents/Books') print(f"Uploading {file_path} as {relative_path}") transfer.upload_file(file_path,'rag-experiments', relative_path) # Make it public if needed #response = client.put_object_acl(ACL='public-read', Bucket='name-of-space', Key=relative_path)
(You can find the gist of the code here )
The upload process will take a while, but it should work seamlessly.
Creating the Knowledge Base on the Gradient Platform
Next, we go ahead and create a Knowledge base. We simply click on create on Digital Ocean’s menu - then we click on Knowledge Bases.
Naming our Knowledge Base
We now configure our knowledge base, starting with our Knowledge Base name - we name it - ai-compliance-for-csos.
Selecting our Data Source for the Knowledge Base
Next, we have the option to select data sources to index, we click on Select Data Source
We select Spaces Bucket or Folder
The Digital Ocean Spaces Bucket we made earlier appears here and we select it right away and add it as a data source
On the menu, you can now see that it has been selected
Selecting the location of the Knowledge Base
We have the option to select where the knowledge base will live and we also select Amsterdam for this.
Choosing the Embeddings Model
Next, we choose the embeddings model. I’ve looked into GTE Large EN V1.5 and I believe that this is the best choice for our experiments. It’s also the most expensive embedding model on Digital Ocean’s platform.
We finalize and create the knowledge base.
Knowledge Base Creation Process
Creating the OpenSearch Database
First, it’ll create the OpenSearch database for our knowledge base then indexing will begin automatically shortly after
This takes some time, depending on the size of your knowledge base, so feel free to proceed with your other activities as you wait.
Indexing event queued
Once the indexing event is queued, you get this notification
Remember, this only happens after our OpenSearch DB is ready - and we can actually go check it out. The credentials will be provided right on Digital Ocean’s UI.
Indexing in Progress
When the indexing is finally in progress, you’ll get this notification
After about 6-8 mins, it finishes the indexing.
Costs of the indexing task
The total cost of this indexing task cost was about $0.02917.
Creating your own AI Agent
Next, we need to create our AI Agent that references our knowledge base.
Naming our AI Agent
Now, we name our agent.
Setting up Agent Instructions
Next we set up agent instructions
This is the prompt we will be using:
You are an AI Compliance Assistant specializing in ethical AI governance for Civil Society Organizations (CSOs). Your knowledge base contains comprehensive resources on AI ethics, data privacy regulations, and responsible AI implementation frameworks.
Your Core Expertise
You have deep knowledge of:
- Global AI ethics frameworks and governance standards
- Data protection regulations and AI-specific compliance requirements
- Privacy-preserving techniques and responsible data practices
- Humanitarian AI guidelines and sector-specific toolkits
- Information security certifications and standards
- Ethical AI implementation for vulnerable populations
- Non-profit specific AI governance resources
How to Respond
When answering questions:
- Prioritize practical, actionable guidance suitable for resource-constrained CSOs
- Cite resources by their official titles (e.g., "The AI Ethics for Nonprofits Toolkit recommends..." or "According to the Data Protection Risk Assessment Framework...")
- Consider the CSO context - limited budgets, diverse stakeholders, mission-driven goals
- Flag compliance requirements based on jurisdiction when relevant
- Suggest step-by-step implementation approaches when discussing complex topics
Key Principles to Emphasize
- Human rights-centered approach: All AI deployments should respect fundamental human rights
- Transparency and explainability: CSOs must be able to explain AI decisions to beneficiaries
- Data minimization: Only collect and process data necessary for the stated purpose
- Inclusive design: Ensure AI systems don't exclude or discriminate against vulnerable groups
- Accountability: Clear governance structures and responsibility assignments
Common Topics to Address
- Conducting AI readiness assessments for non-profits
- Writing data protection impact assessments
- Implementing privacy-preserving techniques on limited budgets
- Building ethical AI governance structures with board oversight
- Ensuring beneficiary consent and data rights
- Cross-border data transfer compliance for international CSOs
- Bias detection and mitigation in humanitarian contexts
Response Style
- Use clear, jargon-free language accessible to non-technical stakeholders
- Provide concrete examples relevant to CSO operations
- Offer alternatives for different resource levels (low-budget vs. well-funded options)
- Include risk assessments focusing on impact to vulnerable populations
- Suggest templates, checklists, or tools when available from the resources
Important Reminders
- Always consider power dynamics between CSOs and beneficiaries
- Emphasize that compliance is not just legal but ethical obligation
- Recommend participatory approaches involving affected communities
- Note when professional legal counsel may be needed
- Highlight free or low-cost resources specifically for non-profits
- When you don't have specific information about a compliance requirement or framework, clearly state this and suggest authoritative sources for further research. Never guess about legal requirements.
Selecting an AI model for our Agent
When selecting a model, we’ll go with DeepSeek R1 Distill Llama 70B
It’s the same size as Llama 3.3 Instruct (70B) but I love how it performs better having tested and used it often.
Adding Agent to a Workspace
Final thing, adding where the agent lives - the workspace name and description (these characters are quite limited so don’t have too many characters for the description especially)
Now we need to add knowledge bases to this agent as it is not attached automatically.
Finalizing Setup
Then we finalize the last steps and create an agent
The agent is now being created
The AI Agent is finally done
Testing the AI Agent in the Playground
The best part about Digital Ocean is that it has the playground that actually lets you test the model right on the Playground
Let me ask this model a sample question around AI Compliance :
Prompt used
We want to develop a comprehensive AI adoption Framework for Civil Society Organizations, based on the knowledge you have, can you please give us 5 of the best resources to read and reference for this task ? The resources must be directly from your knowledge base
Response
To develop a comprehensive AI adoption framework for a Civil Society Organization (CSO), consider the following five key resources, each offering unique insights and guidance tailored to the needs and constraints of CSOs:
1. AI Ethics for Nonprofits Toolkit
Why It's Useful: Specifically designed for nonprofits, this toolkit provides practical, actionable steps for ethical AI adoption, considering limited budgets and resources. It emphasizes transparency, accountability, and inclusivity, ensuring AI aligns with the organization's mission.
2. ISO/IEC 42001 Standard and Checklist
Why It's Useful: As an international standard, ISO/IEC 42001 offers a robust framework for managing AI, ensuring trustworthiness and accountability. The accompanying checklist aids in practical implementation, making it easier for CSOs to meet compliance and ethical standards.
3. AI Safety Governance Framework
Why It's Useful: Focuses on safety and risk management, crucial for CSOs working with vulnerable populations. It provides guidelines to prevent harm and ensure AI systems are used responsibly, addressing both technological and ethical risks.
4. OECD Principles on Trust in AI
Why It's Useful: These principles, from a reputable international organization, emphasize transparency, accountability, and human oversight. They offer a global perspective, helping CSOs align with widely recognized ethical AI standards.
5. India's AI Governance Report
Why It's Useful: Provides insights into implementing AI in diverse and challenging environments, particularly relevant for CSOs operating in regions with unique regulatory and cultural contexts. It offers a tailored approach to AI governance.
These resources collectively address ethics, governance, safety, trust, and regional considerations, essential components for a comprehensive AI adoption framework in CSOs.
This is the reasoning trace from our AI Agent
Okay, so I need to help a Civil Society Organization (CSO) develop a comprehensive AI adoption framework. I'm not very familiar with AI governance, but I'll try to break this down step by step.
First, I should understand what a CSO is. From what I know, CSOs are non-profits or NGOs that work on social causes. They might not have a lot of resources, so any framework needs to be practical and not too expensive.
I remember reading something about AI ethics and governance in the context of CSOs. There were some specific toolkits and guidelines mentioned. Let me think... Oh right, the AI Ethics for Nonprofits Toolkit. That sounds relevant because it's tailored for nonprofits, which are a type of CSO. It probably has actionable steps and considers limited budgets.
Another thing I recall is the ISO/IEC 42001 standard. It's about managing AI in organizations. Since it's an international standard, it might provide a solid foundation for any AI adoption framework. Plus, there was a guide or checklist that came with it, which could be really helpful for implementation.
Then there's the AI Safety Governance Framework. Safety is crucial, especially for organizations dealing with vulnerable populations. This framework likely covers risk management and ethical considerations, which are important for CSOs to ensure they don't inadvertently cause harm.
I also remember something about the OECD Principles on Trust in AI. The OECD is a reputable organization, and their principles are widely recognized. They focus on transparency, accountability, and human oversight, which align well with the values of CSOs.
Lastly, the India-specific AI Governance Report was mentioned. It might offer insights into implementing AI in regions with specific challenges, which could be useful for CSOs operating in diverse areas, especially in countries with unique regulatory environments.
Putting it all together, these resources seem to cover ethics, governance, safety, trust, and regional considerations, which are all essential for a comprehensive AI adoption framework. I should list these five resources, explaining each briefly and why they're useful for CSOs.
I was actually quite impressed by these responses. I tested it extensively even on complex prompts and it delivered every time.
Conclusion
This marks the end of this article. From here, I had a bunch of different options I could explore, like embedding my model as a chat interface on a platform and let it answer user queries or connecting it to open source chat apps and basically running it locally. This was merely an experiment to see how seamless this process is.
I also didn’t run actual evals to test the performance of the Agent but I felt confident because this is a topic I am well versed in and the RAG Agent we built truly delivered.