General
Concepts and terminology
What is Ingestion?
Stack Internal's content ingestion feature leverages the power of AI to quickly turn your internal documents into high-quality Knowledge Objects (Q&A pairs) on your Stack Internal site. This allows you to take static knowledge scattered across your organization and make it dynamic, accessible content your users can easily locate and integrate into their workflows.
What is a Knowledge Object?
A Knowledge Object is an AI-created Q&A pair that has been reviewed, approved, and published by a user.
How do we submit content to the Ingestion process?
You can upload files (documents, images, etc.) directly on your site or by accessing an API file upload endpoint. Connectors allow you to connect Ingestion to an external tool (for example: Confluence Cloud) to ingest content automatically.
How many Knowledge Objects can we ingest?
Every Stack Overflow Internal site allows 100 Knowledge Objects to be ingested per month. You can purchase additional Knowledge Objects with one of several tiered pricing plans. See the "Subscription and Billing" section for more info.
Do all Stack Overflow Internal sites have Ingestion?
Ingestion is available only on Stack Overflow Internal Enterprise.
Who can access the Ingestion feature?
All site admins and moderators can access the Ingestion dashboard, as can regular users with enough reputation. Site admins define the user reputation threshold for Ingestion access.
Security
Data handling and storage
Where is ingested data processed and stored?
Ingestion stores, processes, and retains data in the Microsoft Azure cloud. This includes "blob" storage for raw data and database storage for metadata. We apply strict tenant isolation and access controls to all data.
Does Ingestion comply with enterprise security standards (for example: SOC 2, ISO 27001)?
Yes.
How is sensitive or confidential content handled? Does the Ingestion system detect and flag it automatically?
Ingestion does not detect or flag sensitive or confidential data. Instead, we give admins and moderators tools to easily review and delete content before it's published.
How does Ingestion secure and protect our data?
Ingestion protects your data with multiple layers of security throughout the processing lifecycle. We encrypt all data in transit and at rest with Azure-managed and infrastructure-level encryption. Azure Document Intelligence ensures prompts, completions, and embeddings are not shared with other customers or used to train models without permission. We manage sensitive credentials (API keys, database connections) with Azure Key Vault.
Access and permissions
What permissions does Stack Overflow need to access external systems?
Ingestion permissions depend on the connector and service you're using. Refer to your site's Ingestion admin settings page (Admin settings -> Ingestion) and the Ingestion support articles for specific details.
Who authorizes and configures connector access?
Only site admins can enable and configure Ingestion connectors with their site's admin settings (Admin settings -> Ingestion).
How is authentication managed? Does Ingestion support SSO or OAuth?
As a feature of Stack Overflow Internal, Ingestion uses the same authentication standards and protocols as the main site. The authentication process depends on the specific connector or service you're using.
Does Ingestion have role-based access control for connecting systems as well as reviewing and publishing content?
Yes. Admins have full control, while moderators can manage content. End users who meet or exceed the Ingestion reputation threshold can also manage content. See table below.
| Site role | Permissions |
|---|---|
| Admin | - Enable/disable Ingestion - Enable/disable and configure connectors - Manually upload files for Ingestion - Upload files to API endpoint for Ingestion - Delete uploaded files and resulting content - Set end user reputation threshold - Review, edit, delete, and publish AI-generated content |
| Moderator | - Upload files to API endpoint for Ingestion - Manually upload files for Ingestion - Review, edit, delete, and publish AI-generated content |
| End user (by reputation threshold) | - Upload files to API endpoint for Ingestion - Manually upload files for Ingestion - Review, edit, delete, and publish AI-generated content |
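The role table above can also be read as a simple lookup. The following is a minimal sketch for illustration only: the role keys, permission names, and the `can` helper are hypothetical labels that mirror the table rows, not identifiers from any Stack Internal API.

```python
# Hypothetical permission names that mirror the rows of the role table.
# They are illustrative labels, not identifiers from a Stack Internal API.
CONTENT_PERMISSIONS = frozenset({
    "upload_manual",
    "upload_api",
    "review_edit_delete_publish",
})

ADMIN_ONLY_PERMISSIONS = frozenset({
    "toggle_ingestion",
    "configure_connectors",
    "delete_uploaded_files",
    "set_reputation_threshold",
})

ROLE_PERMISSIONS = {
    "admin": CONTENT_PERMISSIONS | ADMIN_ONLY_PERMISSIONS,
    "moderator": CONTENT_PERMISSIONS,
    "end_user": CONTENT_PERMISSIONS,
}


def can(role: str, permission: str, reputation: int = 0, threshold: int = 0) -> bool:
    """Return True if the role holds the permission.

    End users qualify only when their reputation meets or exceeds the
    threshold a site admin has set.
    """
    if role == "end_user" and reputation < threshold:
        return False
    return permission in ROLE_PERMISSIONS.get(role, frozenset())
```

For example, `can("moderator", "configure_connectors")` is false because only admins configure connectors, while `can("end_user", "upload_api", reputation=500, threshold=500)` is true because the user meets the threshold.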
AI models and governance
Does Ingestion use an LLM?
Yes. Ingestion currently uses GPT‑4.1 mini. Stack Overflow will continue to evaluate and implement new LLMs for Ingestion as models improve.
Can I use my own model for Ingestion?
Not at this time.
Is data used by Ingestion used to train future models?
No. Customer data is not used to train large language models (LLMs), and external LLM providers do not have access to your data. All ingestion and AI enhancement occurs within a secure, logically isolated environment in Azure to keep your data completely private.
Is the AI model isolated per site, or is it shared across instances?
Though all Ingestion jobs use the same LLM, customer data is isolated per tenant.
Can the customer opt out of AI-assisted ingestion or limit its use to specific systems?
Admins can enable or disable Ingestion sitewide, as well as enable or disable specific connectors.
Technical
Ingestion functionality
How long does Ingestion take?
Once an Ingestion job works its way through the queue, Ingestion will typically take just a few minutes depending on file size and amount of content. A large batch of files (for example: 100 files uploaded by API or pulled by a connector) could take an hour or more to process. 1,000 files could take a day or more.
How big of a file can I upload to Ingestion?
Files uploaded by manual upload or API have a size limit of 10 MB. There's no limit to the amount of content a connector can ingest.
How many Q&A pairs will Ingestion create from an uploaded document?
Depending on document size, Ingestion typically generates 5–15 Q&A pairs per document, and about 80% of those become usable Knowledge Objects (approved Q&A pairs). This will vary based on industry-specific use cases, variations in user review behavior, and differences in source content quality and structure.
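To turn the typical figures above into a rough planning estimate, you can multiply them out. This is a back-of-the-envelope sketch, not a guarantee: the function name is hypothetical, and the defaults simply restate the 5–15 pairs per document and roughly 80% approval rate quoted above.

```python
def estimate_knowledge_objects(
    documents: int,
    pairs_per_doc: tuple[int, int] = (5, 15),  # typical range quoted above
    approval_rate: float = 0.8,                # "about 80%" quoted above
) -> tuple[int, int]:
    """Return a (low, high) estimate of Knowledge Objects for a batch."""
    low, high = pairs_per_doc
    return (
        round(documents * low * approval_rate),
        round(documents * high * approval_rate),
    )


print(estimate_knowledge_objects(10))  # prints (40, 120)
```

By the same arithmetic, one document yields roughly 4 to 12 Knowledge Objects, so 100 Knowledge Objects corresponds to somewhere between about 8 and 25 typical documents. Actual results will vary with source quality and review behavior.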
How does the system handle errors or partial imports?
If any part of the Ingestion process fails, the job stops and its status shows as "Failed" on the Ingestion dashboard. You can then delete the job and source file by clicking its three dots button in the right-hand column of the Ingestion jobs list.
What if I uploaded the wrong document by mistake?
The Ingestion dashboard lists all Ingestion jobs. There, an admin can delete any source file and its resulting Q&A pairs.
Can a document be too short or too long for Ingestion?
Ingestion makes no distinction regarding the size of a document in words or pages. The only limit is file size for manual and API file upload, which must be less than 10 MB.
Should I break a large document into smaller files before Ingestion?
If a file is larger than the file upload size limit of 10 MB, breaking it up will allow you to upload it in pieces. Otherwise, there's no practical benefit.
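For Markdown or plain text sources, one way to break up an oversized file is to split it at paragraph boundaries so that each piece stays under the limit. The sketch below makes two assumptions: the `split_text` helper is hypothetical, and "10 MB" is read as 10 × 1024 × 1024 bytes, which this article does not specify. Binary formats such as .docx or .pptx would need to be split in their authoring tool instead.

```python
MAX_UPLOAD_BYTES = 10 * 1024 * 1024  # assumption: "10 MB" read as 10 MiB


def split_text(text: str, max_bytes: int = MAX_UPLOAD_BYTES) -> list[str]:
    """Split text at blank lines into pieces smaller than max_bytes.

    A single paragraph that is itself too large is kept whole, so check
    the returned pieces if that case matters for your content.
    """
    pieces: list[str] = []
    current: list[str] = []
    current_size = 0
    for paragraph in text.split("\n\n"):
        # +2 accounts for the blank-line separator between paragraphs.
        size = len(paragraph.encode("utf-8")) + 2
        if current and current_size + size >= max_bytes:
            pieces.append("\n\n".join(current))
            current, current_size = [], 0
        current.append(paragraph)
        current_size += size
    if current:
        pieces.append("\n\n".join(current))
    return pieces
```

Each returned piece can then be saved as its own file and uploaded separately. Joining the pieces with blank lines reproduces the original text.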
Does Ingestion offer any analytics or reporting?
- The Ingestion dashboard shows how many Q&A pairs have been created and published, as well as how many are currently in review.
- The Admin settings page shows the number of Knowledge Objects created for the current monthly period, remaining Knowledge Objects, and when the next period begins.
- Site admins can request additional Amplitude reports by contacting their Stack Overflow account representative.
Confidence score
What evaluators does the confidence score use?
We use a 9-dimension evaluator framework.
| Evaluator | Definition |
|---|---|
| Coverage | Q&A is well-scoped, not too broad or narrow |
| Knowledge Value | The answer provides genuinely useful information |
| Source Fidelity | Answer reflects the source document |
| Relevance | Answer directly addresses the question asked |
| Answer Depth | Answer covers the topic fully, not partially |
| Question Fluency | Question is well-written and natural |
| Answer Fluency | Answer is grammatically correct |
| Coherence | Answer logically follows from the question |
| Question Tone | Question sounds like something a real person would ask |
Why did my Q&A receive a ## score?
The review queue displays the nine individual evaluator scores and detailed rationale for every Q&A pair so you can better understand why a post received a particular score.
File upload
What file types does Ingestion support?
Ingestion supports the following file types for manual or API upload.
| File type | File extension |
|---|---|
| Microsoft Word document | .docx |
| Microsoft Excel spreadsheet | .xlsx |
| Microsoft PowerPoint presentation | .pptx |
| PDF document | .pdf |
| Image file | .jpeg, .jpg, .png, .heif, .tiff |
| Web page | .html |
| Markdown or plain text file | .md |
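Before a manual or API upload, a script can check a file against the supported types and the 10 MB limit. This is a minimal sketch under stated assumptions: `check_upload` is a hypothetical helper, `.pdf` is assumed as the extension for PDF documents, and "10 MB" is read as 10 × 1024 × 1024 bytes. The upload itself is out of scope here.

```python
from pathlib import Path

MAX_UPLOAD_BYTES = 10 * 1024 * 1024  # assumption: "10 MB" read as 10 MiB

# Extensions from the table above; ".pdf" is assumed for PDF documents.
SUPPORTED_EXTENSIONS = {
    ".docx", ".xlsx", ".pptx", ".pdf",
    ".jpeg", ".jpg", ".png", ".heif", ".tiff",
    ".html", ".md",
}


def check_upload(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks uploadable."""
    problems = []
    extension = Path(filename).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported file type: {extension or '(none)'}")
    if size_bytes >= MAX_UPLOAD_BYTES:
        problems.append(f"file is {size_bytes} bytes; limit is {MAX_UPLOAD_BYTES}")
    return problems
```

For a file on disk, you could call `check_upload(path.name, path.stat().st_size)` with a `pathlib.Path` and skip any file that returns a non-empty list.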
Confluence Cloud connector
Does Ingestion support Confluence permissions?
Confluence permissions are not maintained by the connector. All ingested and approved content will be visible to all users on your Stack Internal site. Because of this, we recommend starting with your most broadly accessible Confluence spaces first. That's usually where the highest-value knowledge lives and is the fastest path to ROI.
Admins should also be mindful about Confluence permissions when connecting Ingestion to spaces with sensitive or confidential information. After ingestion, any approved and published content will be visible to all users on your Stack Internal site.
How does the Confluence Connector handle duplicates?
The Confluence Cloud connector ensures only new pages are ingested to prevent overlap.
Integration architecture
What Ingestion connectors are available?
Ingestion currently offers a connector for Confluence Cloud, as well as direct file upload and a Stack Overflow API v3 endpoint. We're actively developing other connectors.
How does Ingestion handle ownership of ingested content?
For manual and API file uploads, Ingestion attributes the resulting Knowledge Objects to the user that uploaded the file. For content ingested by a connector, Ingestion retains ownership by mapping the user's email address on Stack Overflow Internal to the external system.
Is ingestion a one-time import or can it be continuous sync?
File and API upload are one-time ingestion processes. The Confluence Cloud connector offers continuous sync, scanning all Confluence pages daily to ingest new documents.
What’s the process for de-duplication or conflict resolution if the same content exists in multiple sources?
On file or API upload, Ingestion scans existing site content to eliminate redundant content. The Confluence Cloud connector ensures only new or updated content is ingested.
Are there APIs for programmatic ingestion or export?
Yes, we have an API v3 endpoint for file Ingestion. Learn more in the Ingestion Quickstart article.
Content review and publishing
How does the AI-generated content integrate with other content on the site? How can we be sure Ingestion won't fill our site with "AI slop"?
Ingestion uses "human in the loop" AI to integrate AI-generated content into the same trusted workflow your site uses now. After ingestion, all AI-generated Q&A pairs remain unpublished until a human reviews, edits, and approves the content. The Ingestion process tags all AI-generated content with the "ai-assisted" tag for transparency.
No AI-generated content appears on your site until an admin, moderator, or authorized user approves it.
How transparent is the Ingestion content score?
The Ingestion "Content score" box displays sub-scores for all nine evaluators used to determine the quality of the generated content.
Can customers define retention, review, or re-verification intervals for ingested content?
Once published to the site, ingested Q&A pairs fall under the same review and revision processes as all other content. Learn more about Content Health.
Does Ingestion notify content reviewers of new ingested content?
As with the existing content health system, users will see new ingested content in their “For you” notifications and “For you” home page area.
If I edit the ingested Q&A pair, will its content score change?
No. The content score is based on the initial AI-generated content. The Ingestion system will not rescore a Q&A pair after editing.
If I edit the ingested Q&A pair, is the original source content also updated?
Ingestion is a one-way process. Edits made to ingested content do not write back to the original source material, whether that's an uploaded file or content pulled from a connector.
I click on the uploaded file name on the Ingestion dashboard, but I don't see any questions on the review page. Why?
When you click on an uploaded file, it sets the content review filter to show only unpublished Q&A pairs from that document. If all Q&A pairs from that document are already published (or none were initially created), the review screen will be blank. Clear the review page filter to show an unpublished Q&A pair from a different document, or return to the Ingestion dashboard.
I'm reviewing posts created from a document someone else uploaded. How can I view the source document?
Ingestion does not make the original file available for manual or API uploads. During review, users can link to the original Confluence space and page for content ingested through the Confluence Cloud connector.
Can I remove the "ai-assisted" tag?
Yes. Content reviewers can add or remove tags as part of the review process.
Subscription and billing
Pricing and Ingestion limits
How is Ingestion priced?
Stack Internal Ingestion is a subscription-based add-on. All Stack Internal Enterprise sites receive a base allotment of 100 Knowledge Objects per month so users can experience the power of the Ingestion tool. For higher volume needs, organizations can purchase premium tiered add-ons. For more information, see the Ingestion Subscription and Billing article.
What is a Knowledge Object?
A Knowledge Object is an AI-created Q&A pair that has been reviewed, approved, and published by a user. The Knowledge Object is the core metric used to measure ingestion capacity.
What if we hit our Ingestion plan's limit?
When you hit the Knowledge Object limit of your Ingestion plan, Ingestion enters a "frozen" state. Your users will not be able to upload files manually or by API, and all connectors will stop ingesting content. Unless you upgrade your plan, Ingestion will remain frozen until the beginning of your next plan cycle.
Your Stack Overflow account representative will work with you to advise on the best Ingestion tier for your use case.
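The frozen-state behavior described above can be pictured as a simple monthly counter. This is an illustrative model only, not how Ingestion is implemented: the `IngestionQuota` class and its field names are hypothetical, and the default limit restates the base allotment of 100 Knowledge Objects per month.

```python
from dataclasses import dataclass


@dataclass
class IngestionQuota:
    """Hypothetical model of a plan's monthly Knowledge Object allowance."""

    monthly_limit: int = 100  # base allotment; paid tiers would be higher
    used: int = 0             # Knowledge Objects counted this cycle

    @property
    def remaining(self) -> int:
        return max(self.monthly_limit - self.used, 0)

    @property
    def frozen(self) -> bool:
        # Uploads and connectors stop once the limit is reached.
        return self.used >= self.monthly_limit

    def start_new_cycle(self) -> None:
        # The counter resets at the beginning of the next plan cycle.
        self.used = 0
```

In this model, a site with `used=99` has one Knowledge Object remaining and is not frozen. One more Knowledge Object freezes Ingestion until `start_new_cycle()` runs or the limit is raised by a plan upgrade.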