Engineering at OSlash
From code to
cutting-edge
Unveiling the technological expertise that powers Generative AI at OSlash
Extract
OSlash Copilot automatically & recursively extracts & syncs data from your data sources
Encode
Our rigorously trained & optimized few-shot models help improve embeddings & minimize latency
Index
Generate robust vector indexes on any database, covering up to 5000 pages, with near-zero latency
Search
Boost search accuracy & relevance with optimized response streaming & multi-dimensional reranking
Execute
Generate AI-summaries for search queries or execute multi-step workflows using natural language
Proprietary
intent detection
IntelliDetect is a powerful, proprietary few-shot intent detection model that brings the benefits of data efficiency, quick adaptation, and reduced annotation costs.
It is based on Google’s state-of-the-art intent detection model and is designed and trained to achieve an accuracy of up to 99% in classifying user intent as either a search intent or an ask intent.
IntelliDetect’s lightning-fast response time ensures near real-time intent classification resulting in reduced latencies, faster results, and a better customer experience overall.
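The production model is proprietary, but the classification step can be pictured as a nearest-example lookup over a handful of labeled queries. The sketch below is only an illustration: it substitutes a simple bag-of-words similarity for the real few-shot embeddings, and the example queries and the classify_intent helper are hypothetical.

```python
from collections import Counter
from math import sqrt

# Hypothetical few-shot examples; the real IntelliDetect training data is proprietary.
FEW_SHOT_EXAMPLES = {
    "search": ["find the Q3 revenue report", "where is the onboarding doc"],
    "ask": ["summarize our refund policy", "what does the SLA say about uptime"],
}

def _vectorize(text: str) -> Counter:
    # Stand-in featurizer; the production model uses learned embeddings instead.
    return Counter(text.lower().split())

def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def classify_intent(query: str) -> str:
    """Label a query as 'search' or 'ask' by its closest few-shot example."""
    qv = _vectorize(query)
    scores = {
        label: max(_cosine(qv, _vectorize(example)) for example in examples)
        for label, examples in FEW_SHOT_EXAMPLES.items()
    }
    return max(scores, key=scores.get)

print(classify_intent("where is the travel policy doc"))  # -> "search"
```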
Robust vector indexing
on any database
Index 5000 pages in <1 min
OSlash Copilot follows URLs & traverses documentation iteratively, ensuring a comprehensive & up-to-date index of your content.
It intelligently connects to the metadata associated with content, enhancing its ability to understand the meaning & relevance of the data.
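The crawler itself is not open source; as a rough illustration, a breadth-first traversal of documentation links could look like the sketch below, where the crawl helper, the page limit, and the same-site filter are assumptions made for the example.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen

class LinkExtractor(HTMLParser):
    """Collects href targets from anchor tags on a page."""
    def __init__(self):
        super().__init__()
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.links.extend(value for name, value in attrs if name == "href" and value)

def crawl(root_url: str, max_pages: int = 50) -> dict[str, str]:
    """Breadth-first traversal that returns {url: html} ready for indexing."""
    seen, queue, pages = set(), deque([root_url]), {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        if url in seen:
            continue
        seen.add(url)
        html = urlopen(url).read().decode("utf-8", errors="ignore")
        pages[url] = html
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            absolute = urljoin(url, href)
            if absolute.startswith(root_url):  # stay within the documentation site
                queue.append(absolute)
    return pages
```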
Higher performance, lower price
OSlash Copilot uses the latest embedding model based on the GPT-3 tokenizer. This makes it compatible with a wide range of NLP tasks.
The token window of 8191 tokens also makes it suitable for processing large documents. It's highly performant, affordable, and scalable to boot.
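As a small illustration of what the 8191-token window means in practice, a document can be checked against the limit before it is sent for embedding. The sketch assumes the cl100k_base tokenizer from the tiktoken package; the exact tokenizer used in production is not specified here.

```python
import tiktoken  # pip install tiktoken

ENCODING = tiktoken.get_encoding("cl100k_base")  # assumed tokenizer for this sketch
TOKEN_WINDOW = 8191

def fits_in_window(text: str) -> bool:
    """True if the document can be embedded in a single request."""
    return len(ENCODING.encode(text)) <= TOKEN_WINDOW
```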
Overcome token limits
To overcome token limitations, which can hinder complex queries, our model utilizes a technique called chunking. It breaks complex documentation into smaller chunks.
By efficiently processing and dividing lengthy queries into manageable chunks, the model effortlessly handles even the most extensive and intricate search requests on your data.
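In rough outline, chunking splits a document into overlapping token windows so every chunk stays within the model's budget. The chunk size and overlap below are illustrative defaults, not the values used in production.

```python
import tiktoken

encoding = tiktoken.get_encoding("cl100k_base")  # assumed tokenizer for this sketch

def chunk_tokens(text: str, chunk_size: int = 512, overlap: int = 64) -> list[str]:
    """Split text into overlapping token windows that each fit the model's limit."""
    tokens = encoding.encode(text)
    step = chunk_size - overlap
    return [
        encoding.decode(tokens[start:start + chunk_size])
        for start in range(0, len(tokens), step)
    ]
```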
Fastest LLM-powered search experience
Query optimization
OSlash Copilot uses query caching to avoid repetitive computations for similar queries. It leverages search-specific algorithms, such as ranking algorithms, for relevance sorting & feedback for query expansion.
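A simplified picture of query caching with relevance sorting is sketched below; the in-memory lru_cache, the toy corpus, and the run_vector_search stand-in are illustrative, not the production implementation.

```python
from functools import lru_cache

def run_vector_search(query: str) -> list[tuple[str, float]]:
    """Stand-in for the real vector search; returns (doc_id, score) pairs."""
    corpus = {"doc-1": "how to reset your password", "doc-2": "expense report template"}
    return [(doc_id, sum(word in text for word in query.split())) for doc_id, text in corpus.items()]

@lru_cache(maxsize=4096)
def cached_search(normalized_query: str) -> tuple[tuple[str, float], ...]:
    """Memoize ranked results so repeated or similar queries skip recomputation."""
    results = run_vector_search(normalized_query)
    return tuple(sorted(results, key=lambda pair: pair[1], reverse=True))

def search(query: str):
    # Normalizing the query lets trivially different phrasings hit the same cache entry.
    return cached_search(" ".join(query.lower().split()))
```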
Preprocessing & filtering
OSlash Copilot eliminates unnecessary information & filters irrelevant data before the actual search process. This reduces the search space and speeds up search operations.
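For example, restricting candidates by metadata before the vector search runs might look like this (the field names are hypothetical):

```python
def prefilter(documents: list[dict], source: str | None = None, updated_after: str | None = None) -> list[dict]:
    """Drop documents that cannot match before the expensive vector search runs."""
    candidates = documents
    if source:
        candidates = [d for d in candidates if d.get("source") == source]
    if updated_after:
        candidates = [d for d in candidates if d.get("updated_at", "") >= updated_after]
    return candidates
```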
Partial search & incremental updates
OSlash Copilot implements partial search mechanisms to provide intermediate results while the search is ongoing.
For frequently updated data, we use incremental updates to keep search indexes up to date without re-indexing the entire dataset.
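One way to picture incremental updates is to re-index only documents whose content hash has changed since the last sync; the hash-tracking dict and the plain-text "index" below are stand-ins for the real embedding and upsert steps.

```python
import hashlib

def incremental_update(index: dict[str, str], seen_hashes: dict[str, str], documents: dict[str, str]) -> None:
    """Re-index only documents whose content changed since the last sync."""
    for doc_id, text in documents.items():
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if seen_hashes.get(doc_id) == digest:
            continue  # unchanged: keep the existing index entry
        index[doc_id] = text  # stand-in for re-embedding and upserting the vector
        seen_hashes[doc_id] = digest
```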
Parallel processing
We utilize multi-threading or distributed processing to perform search tasks in parallel.
This can significantly boost search speeds, especially for large-scale applications and datasets.
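A minimal sketch of the fan-out, assuming the index is partitioned into shards that can be searched independently:

```python
from concurrent.futures import ThreadPoolExecutor

def search_shard(shard: list[str], query: str) -> list[str]:
    """Stand-in for searching one partition of the index."""
    return [doc for doc in shard if query in doc]

def parallel_search(shards: list[list[str]], query: str) -> list[str]:
    """Fan the query out across shards in parallel and merge the partial results."""
    with ThreadPoolExecutor() as pool:
        partials = pool.map(lambda shard: search_shard(shard, query), shards)
    return [hit for partial in partials for hit in partial]
```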
Compression & serialization
We compress data and use efficient serialization techniques.
This reduces data transfer times between components, especially in distributed search systems.
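In its simplest form, this amounts to serializing results and compressing the payload before it crosses a service boundary; the JSON-plus-zlib combination below is illustrative.

```python
import json
import zlib

def serialize(results: list[dict]) -> bytes:
    """Serialize and compress results before sending them between components."""
    return zlib.compress(json.dumps(results).encode("utf-8"))

def deserialize(payload: bytes) -> list[dict]:
    """Reverse the compression and serialization on the receiving side."""
    return json.loads(zlib.decompress(payload).decode("utf-8"))
```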
Benchmarking & profiling
We regularly benchmark and profile our search system to identify performance bottlenecks.
The insights help us fine-tune and optimize the architecture.
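A tiny example of the kind of instrumentation involved: timing an individual search stage so slow stages surface during profiling (the label and the print target are illustrative).

```python
import time
from contextlib import contextmanager

@contextmanager
def timed(label: str):
    """Measure one search stage so bottlenecks show up during profiling."""
    start = time.perf_counter()
    yield
    print(f"{label}: {(time.perf_counter() - start) * 1000:.1f} ms")

# Example usage:
# with timed("vector search"):
#     results = run_query("expense report")
```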
Natural-language workflows
Our proprietary TaskSynth classifier model has 3 components working in tandem to generate workflows from natural language commands.
Task Builder
The Task Builder classifies an incoming user query as a task and prepares the list of actions that need to be executed. This list is fed to the Task Validator.
Task Validator
The Task Validator checks the list of tasks generated for missing details or inconsistencies, if any. If the task has all necessary details for execution (executable task), it passes the instructions on to the Task Executor. Else, it asks the user for further details.
Task Executor
The Task Executor implements the executable task and displays the corresponding response or launches the corresponding workflow in the frontend of the application.
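TaskSynth itself is proprietary, but the hand-off between the three components can be sketched roughly as below; the Task dataclass, the single scheduling action, and the required "time" field are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Task:
    action: str
    params: dict = field(default_factory=dict)

def build_tasks(query: str) -> list[Task]:
    """Task Builder: classify the query as a task and derive the action list."""
    return [Task(action="schedule_meeting", params={"title": query})]  # toy classifier

def validate(tasks: list[Task]) -> list[str]:
    """Task Validator: report any details still missing before execution."""
    missing = []
    for task in tasks:
        if task.action == "schedule_meeting" and "time" not in task.params:
            missing.append("time")
    return missing

def execute(tasks: list[Task]) -> list[str]:
    """Task Executor: run each executable task and return frontend responses."""
    return [f"Executed {task.action} with {task.params}" for task in tasks]

def handle(query: str) -> list[str]:
    tasks = build_tasks(query)
    missing = validate(tasks)
    if missing:
        return [f"Please provide: {', '.join(missing)}"]  # ask the user for details
    return execute(tasks)
```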
Cost-optimized
LLM usage
Caching & compression
We cache the input text and prepopulate queries so that when a user inputs similar sentences, we can retrieve the cached response instead of making a new API call.
Compressing cached data further optimizes resource utilization and reduces associated costs for us.
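A minimal sketch of a compressed response cache, assuming an in-memory dict keyed by the prompt; the production store and eviction policy are not described here.

```python
import zlib

response_cache: dict[str, bytes] = {}

def remember(prompt: str, response: str) -> None:
    """Store the LLM response compressed to reduce memory and storage costs."""
    response_cache[prompt] = zlib.compress(response.encode("utf-8"))

def recall(prompt: str) -> str | None:
    """Serve a cached response instead of issuing a new API call."""
    blob = response_cache.get(prompt)
    return zlib.decompress(blob).decode("utf-8") if blob else None
```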
Time-based expiration
We implement a time-based expiration policy for cached data.
This helps us serve reasonably fresh data and avoid outdated
or incorrect information.
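Sketched below with an illustrative 15-minute TTL; the actual expiry window is a tuning choice and is not stated here.

```python
import time

TTL_SECONDS = 15 * 60  # illustrative expiry window
cache: dict[str, tuple[float, str]] = {}

def put(key: str, value: str) -> None:
    cache[key] = (time.monotonic(), value)

def get_fresh(key: str) -> str | None:
    """Return a cached value only if it has not expired yet."""
    entry = cache.get(key)
    if entry and time.monotonic() - entry[0] < TTL_SECONDS:
        return entry[1]
    cache.pop(key, None)  # expired entries are evicted lazily
    return None
```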
Content-based hashing
We employ content-based hashing techniques to store unique LLM responses in the cache.
This way, we avoid redundant caching of responses for similar queries, saving storage space and memory.
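For example, hashing a normalized form of the query gives near-duplicate phrasings the same cache key (the normalization rule here is deliberately simple):

```python
import hashlib

def cache_key(query: str) -> str:
    """Hash the normalized query so near-duplicate phrasings share one cache entry."""
    normalized = " ".join(query.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()
```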
Warming-up the cache
We proactively load frequently used data into the cache during system startup or when the cache is empty.
This helps minimize cache miss rates and ensures that popular data is readily available.
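In outline, warming the cache means precomputing answers for the most frequent queries at startup; the popular_queries list and the answer callable below are placeholders.

```python
def warm_cache(cache: dict[str, str], popular_queries: list[str], answer) -> None:
    """Precompute answers for frequent queries so early requests avoid cache misses."""
    for query in popular_queries:
        if query not in cache:
            cache[query] = answer(query)

# Example (hypothetical helper):
# warm_cache(cache, ["pto policy", "expense report template"], answer=run_llm_query)
```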
Security-first
approach to AI
Whether you opt for an on-site or a cloud-based installation of OSlash, we follow the same set of guidelines and best practices to ensure that your data is secure at all times.
We understand that our users entrust us with their information, and we are committed to ensuring it remains safe. We do not store any personally identifiable information (PII) of our end-users.
We only require organizations to provide a uniquely identifiable key for each user when making requests, to ensure analytics can be mapped to the correct users. We only collect this unique ID provided by the organization, with no access to any other user data. Read the OSlash trust guidelines in detail here.