Introduction to Innovative Solutions for Box

ABBYYInnovations for Box

In this series we will document several innovative solutions for Box that bring unique value to, or complement, Box’s highly secure document collaboration service.  Our intention is to inspire creative thinking about the possibilities of the Box platform.  Some solutions use existing capabilities from different vendors that, when put together and configured properly, form a complete solution.  Others require some additional integration work, primarily creating Web Services connectivity.  Even when additional integration work is required, the goal is that most of the solution is already realized, so that complication does not lead to long development times or costly integration fees.  We will not cover every single step in the configuration process, but we will still be thorough.  Software installation is assumed and will not be covered.  Just like the Box experience, we strive to offer a great, highly effective user experience that enhances productivity in your professional and personal life.

We have created a schedule of proposed solutions to share with you; however, your input is what matters most to us, so we are flexible about modifying our schedule.  If there is something specific you would like to see documented, or a solution idea you would like to suggest, please contact us and we can consider creating something custom.  We’d love to hear from you!


Innovations for Box (click link below for more information)

  1. File Conversion for Box
  2. Image Import with Cloud Conversion for Box
  3. ERP with Automatic Data Capture for Box
  4. SharePoint with Automatic Data Capture and two-way syncing for Box
  5. High-Volume Scanning and Conversion with WebDAV upload for Box
  6. Exploiting Big Data with Indexes for Box
  7. Mobile with Automatic Data Capture for Box
  8. Systems of Record and High Collaboration for Box
  9. ABBYYForce for Box

Frankie-the-frustrated worker: Making Salesforce better with Automatic Data Entry

For this particular blog post we would like to take a light-hearted approach to a major problem.  The problem is the lost productivity and user frustration that come from populating Line of Business applications through manual data entry rather than automation.

To illustrate my point, let’s take one of the most popular Software as a Service (SaaS) applications ever.  While the application is simple to use and easy to manage, what it lacks is the ability to take information from paper or an image and put it directly into database fields.


1.  Let’s take a moment to follow the steps Frankie-the-frustrated worker must take to import data into the application.
2.  Commentary of Frankie-the-frustrated worker:
“Frustrating!  Step 1 of 7????”
3.  Commentary of Frankie-the-frustrated worker:
4.  Commentary of Frankie-the-frustrated worker:
5.  Commentary of Frankie-the-frustrated worker:
 “FORGET IT!!!!!!!!!!!! 
 THIS WILL NEVER END!!!!!!!!!!!!!!!!! 
 ISN’T THERE AN EASIER WAY????????????????” 


Education and modern technology reduce Frankie’s frustration

Are we still living in the stone age when it comes to data entry into computer systems?  Isn’t there a more efficient method to automatically populate data in your software application instead of costly manual data entry?  It’s 2014 after all, not 1914.  Why do we accept such primitive methods of data entry?


Answer:  Because we need to educate the market on the capabilities of capture technologies.  We also need to strive to make integration and usage as easy as possible.  If you build it, they will come.


Eliminating Frankie’s frustration with Ubiquitous Information Capture
Realizing the dream of Ubiquitous Information Capture directly into applications is much easier than you might think but we must educate the market on current capabilities. The idea is simple, yet highly effective.  Embed the ability to take photos with a smart phone and/or capture paper documents from a scanning device directly into your software application.  Note that all I’ve done in the screen prints below is add a small icon of a camera and scanner directly into my CloudConnectMashup software application.


Now, I can offer my users a truly great user experience because contributing information is nearly effortless and removes pain associated with manual data entry.  This translates directly into reduced operational costs, improved efficiencies and an overall better work environment.
Think about all the lost opportunities to drastically reduce labor costs, most likely in the billions if not trillions of dollars, associated with manual data entry in just the use cases below:


1.  Transportation applications with Bills of Lading, Proof of Deliveries, Trip Sheet or Scale Tickets


2.  Field Service applications with Proof of Work delivered, Vehicle Identification Number, Work Orders or Assessment documentation


3.  Contracts Management applications with Amendments, Terms and Conditions or License Agreements


4.  Invoice Management applications with Invoices, corresponding Packing Lists or Proof of Performance


5.  Sales/Contact Relationship Management applications with Business Cards, Agreements or Correspondences


Do you know a Frankie in your organization?  Do you have a story, good or bad, to tell?  We’d love to hear your feedback.

Scan-to-Alfresco technology options

Product: Description
Acxio: AGIA, the ACXIO Generic Injector for Alfresco
ChronoScan: We offer a full-featured Scan & Index solution with OCR, barcode reading and PDF text output.  It is free with a nag screen and costs 130 USD to buy, and it can export directly to Alfresco through the CMIS protocol.
Corium: Corium is dedicated to building and providing you with the best solutions for your data capture, document management and content management business requirements. Corium developed “Librex Capture”, a tool that allows you to capture all your content in a simple, unified and structured way, whether you have to scan paper documents, capture emails and faxes, or import electronic documents automatically or manually. Also, Librex performance tools allow you to automate your content metadata extraction (OCR, Barcode, Data lookup). Its smart connectors allow you to transfer your documents to various systems in a structured way, following your business rules for automatic classification. Librex offers a smart connector to Alfresco. For more information, see
Ephesoft: If you want to do serious recognition or classification, separation, and metadata extraction of data similar to the big players (Captiva, Kofax, Readsoft), you should check Ephesoft. There’s a free community version and an enterprise one with commercial OCR and other benefits including full support. There are no click charges and it’s all browser-based, written in Java. It exports to XML or to any CMIS-compatible system like Alfresco. A lot of power for no initial outlay, just the support payment.
Ezescan: Ezescan is the ideal scalable scanning solution. The product has had many years of integration with other DM systems; there are 800+ sites across the globe using the solution for things like general correspondence, ad-hoc scanning, invoices and forms. You can scan as many documents as you like, all without a per-page click charge. The product will OCR, ICR, BCR and IDR depending on which modules you choose and, best of all, it’s just been released for Alfresco. You can also upload to other line of business systems at the same time: scan once, upload multiple times.
InstaCapture: Quikcapture™ is a feature-rich batch capture product designed to capture and transform high volumes of paper documents into text-searchable electronic documents.  With a rich feature set to streamline and automate the document capture process, it enables organizations to rapidly transform paper assets into actionable knowledge assets.  It offers compelling features to completely automate the capture process with automatic indexing, automatic document recognition (ADR), zonal OCR, full-text search and searchable PDFs. Quikcapture™ is effective both as a standalone document capture solution and as a distributed capture solution for your existing document management systems; it offers readily configurable connectors to FileNet, Alfresco, SharePoint and DB2 Content Manager.
IP-TECH: It depends on how many papers there are to scan.  For unit scanning, we use a connector that sends the document to Alfresco with an option to include its metadata (by the way, we adapted the connector to work on ScanSnap scanners as a ScanSnap Manager custom profile).
IRIS: IRISPowerscan™ will identify and separate the scanned documents automatically through barcodes, patch codes, text detection, or even document layout alone! Once their type is known, documents can be indexed by extracting key data for further processing.
Kodak Capture Pro: Convert forms, invoices, patient records and other critical business documents to high-quality images quickly and efficiently. Capture critical index data and automatically deliver it to databases and applications. Get powerful, flexible batch capture functionality in desktop to high-volume production capture environments.
Kofax: Integrating Kofax and Alfresco provides complete Content Management support including capture, management and publishing of content. Kofax captures content from all kinds of sources, usually via scanning and OCR. The captured information is then “released” to Alfresco, to be managed in an ad-hoc manner or via pre-defined business processes.
Micro Strategies: MSI Profiler for Alfresco is a powerful, flexible and easy-to-use application developed to place robust indexing and scanning management features into the hands of users with intuitive interfaces. The latest version of MSI Profiler represents a next-generation release of a product that has continually evolved over the past five years in response to user feedback and specific industry requirements.
Scanpoint: Documalis is a functional complement to major document projects, providing solutions to capture faxes and paper documents, indexing via Automatic Document Recognition and Automatic Document Content Reading, OCR and PDF conversion, digital signing and workflow for outsourced Alfresco, FileNet, Nuxeo, MS SharePoint …
Zia Consulting Inc: The solution we use for data entry and release is something we created called Fresh Capture. It allows users to browse a directory for PDFs (soon to be compatible with other formats), preview the document, select the document type, enter metadata and release to the repository. It is an AIR app that uses CMIS.


ABBYYForce for Box

Use Case:  The Salesforce product offering is now much more than it was just a few short years ago, when the focus was primarily on the core Customer Relationship Management (CRM) Software as a Service (SaaS) capability.  The ecosystem has evolved to where there are many useful and innovative SaaS applications built using the platform services that Salesforce now offers.  The relative ease of use from a development standpoint, the short time needed to start utilizing an application and the decreased complexity are just a few of the reasons why the platform is so successful.  An independent third-party CustomerSat Survey in July/August 2009 reported the interesting statistics below, which confirm these platform benefits:

The platform encourages new innovation with easy-to-use development environments, and this translates directly into terrific solution design opportunities for hardware manufacturers, mobile developers, SaaS providers and even Enterprise customers themselves to create custom mashup applications for their precise needs.  In this particular solution we will use a similar platform concept, except that instead of CRM application logic and workflow, the ABBYYForce project offers Conversion and Data Capture as a Service.  Salesforce, with its support of metadata and logic, in conjunction with ABBYY metadata extraction technology delivered as a service and Box secure storage and collaboration, is an ideal combination for Enterprise organizations looking for best-of-breed functionality.

An animated version of the vision:

Use case scenarios:

  1. Scanning device manufacturers and mobile developers:  Devices that are capable of capturing images are quickly becoming ubiquitous.  This includes not only dedicated or network-attached sheet-fed devices but also multifunction devices with scanning capability and, especially, mobile devices with smart phone cameras.  The opportunity is tremendous for device manufacturers, as well as software developers who create integrated solutions using their tools and SDKs, to offer a more complete solution than just the capture device itself.  Box is a perfect option because highly secure storage and effective collaboration on content are at its core and are the least common denominator of customer expectations.  Additionally, Box offers many ways to integrate with its service, including the Box API, the Box OneCloud Platform and Box Embed, so there are several options depending on requirements.
  2. Software as a Service (SaaS) providers:  SaaS solution providers are revolutionizing the way business applications are delivered, with great potential to offer their customers improved operational efficiency without the time-consuming tasks of procuring, installing and deploying traditional on-premise software.  Now organizations of all sizes can have robust, enterprise-level applications such as CRM, Enterprise Resource Planning (ERP) or Travel & Expense (T&E) Management without the typical barriers to actually begin utilizing these applications.  However, many of the process workflows associated with these applications still involve manual data entry at some point.  Examples include manually entering business card data into your CRM, keying invoice details from a received invoice into your ERP or entering all the line item details from an expense receipt into your T&E system.  Adding Data Capture as a Service, a complementary technology that is either embedded directly into your SaaS user interface or offered as a service that automatically populates index fields with relevant metadata, takes business efficiency to the next level.  By adding this capability, SaaS providers can deliver a tangible return on investment through reduced manual labor costs, which helps move sales forward more quickly and/or justify subscriptions for additional seat licenses because of the improved total cost of ownership.
  3. Enterprise customers for internal projects:  Since nearly all Software as a Service applications offer integration possibilities via a Web Services application programming interface (API), integration over the internet is much easier than in years past.  Traditionally, getting two systems to communicate was often an expensive professional services engagement that took time, money and intimate knowledge of these systems.  The standards and styles that Web Services utilize, such as XML, HTTP or REST, open up possibilities for a dynamic group of creative and innovative software developers to integrate applications with agility never seen before.  Savvy Enterprise customers may already have the internal software development skills themselves, or can outsource projects to this newly skilled set of Web Services developers.  In this use case scenario an Enterprise organization can select best-of-breed applications for its particular needs and have a developer with Web Services skills integrate, or just finely tune, applications for tight interoperability.  For example, a solution might use Apttus for its Configure/Price/Quote (CPQ) Management system and Concur for its Travel & Expense (T&E) Management system, both of which already have Salesforce integration, and then use ABBYY Data Capture as a Service to add Data Capture capability to these applications.  And since the Enterprise realizes that its workforce and its customers are highly active on mobile devices, it uses the Box Web Services API to store the captured images directly in Box.  This way everyone interacting and collaborating on content can use any one of Box’s highly useful mobile applications.
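
To make the last scenario a little more concrete, here is a minimal sketch in Python of what "storing a captured image in Box over Web Services" might look like.  It only builds the HTTP request and does not send it.  The endpoint is Box's file upload URL as we understand it; the function name, the token and the folder ID are placeholders invented for this illustration, and a real integration would also need authentication and error handling.

```python
import json
import uuid
import urllib.request

# Box file upload endpoint as we understand it; everything else is illustrative.
BOX_UPLOAD_URL = "https://upload.box.com/api/2.0/files/content"


def build_box_upload_request(access_token, folder_id, file_name, file_bytes):
    """Build (but do not send) a multipart request that uploads one file to Box."""
    boundary = uuid.uuid4().hex
    attributes = json.dumps({"name": file_name, "parent": {"id": folder_id}})

    # Part 1: JSON attributes naming the file and its parent folder.
    # Part 2: the raw bytes of the captured image or document.
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        b'Content-Disposition: form-data; name="attributes"\r\n\r\n',
        attributes.encode("utf-8"),
        b"\r\n",
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{file_name}"\r\n'.encode(),
        b"Content-Type: application/octet-stream\r\n\r\n",
        file_bytes,
        b"\r\n",
        f"--{boundary}--\r\n".encode(),
    ])

    return urllib.request.Request(
        BOX_UPLOAD_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

Passing the returned request to urllib.request.urlopen would perform the upload.  Keeping the request-building step separate makes it easy to inspect or test without touching the network.
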
Features
  • Pre-built templates
  • Customization easily achievable
  • Fit your specific organization needs

Benefits
  • Quick adoption for better return on investment
  • Reduce outsourcing development costs
  • Agility to fit precise business requirements

Solution Description:   ABBYYForce is the concept of a pre-built collection of ‘Custom Objects’ within Salesforce that are basically different Document Types.  For example, the document types we will use are Business Card, Invoice Statements, Questionnaire and Banking Documents.  These Custom Objects are packaged together in what Salesforce calls an “App” and given to Salesforce administrators, who can then install a complete suite of document types in minutes.  Capture is an extension of a business process, so the first thing we’ll want to do is create the Custom Objects in Salesforce.  Once these Custom Objects are created, we will map our Data Capture index fields to the Custom Object fields.
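
The field mapping described above can be pictured as a simple lookup table.  The following Python sketch is only an illustration of the idea, not the ABBYYForce implementation: the object and field names are invented, and the only real convention used is that Salesforce custom objects and custom fields carry a `__c` suffix in their API names.

```python
# Hypothetical mapping from Data Capture index fields to Salesforce Custom
# Object fields. The "__c" suffix is Salesforce's convention for custom
# objects and fields; the specific names are made up for this example.
BUSINESS_CARD_MAPPING = {
    "object": "Business_Card__c",
    "fields": {
        "Name": "Full_Name__c",
        "Title": "Job_Title__c",
        "Company": "Company__c",
        "Email": "Email__c",
    },
}


def map_to_custom_object(captured, mapping):
    """Translate captured index fields into a record for the Custom Object.

    Returns (record, unmapped) so that fields without a mapping are
    reported back to the administrator instead of being silently dropped.
    """
    record, unmapped = {}, {}
    for index_field, value in captured.items():
        target = mapping["fields"].get(index_field)
        if target is None:
            unmapped[index_field] = value
        else:
            record[target] = value
    return record, unmapped
```

Each document type (Business Card, Invoice Statements and so on) would get its own mapping table, which is what makes adding a new document type a configuration task rather than a development task.
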


System Requirements:

Note:  This is a software developer and/or systems integrator solution.  While many of the concepts are achievable, there is some level of software integration that will be required.

  1. Box account
  2. ABBYY technology (depending on specific requirements)
  3. Salesforce account


Configuration Steps (Complexity = Software integration required):

  1. Subscribe to ABBYY Online Services or login to access the services account
  2. Review the Configure Services menu
  3. Create Custom Objects in Salesforce
  4. Create Custom Fields with Data Types in Salesforce
  5. Configure an input device and copy the generated code
  6. Paste the code into your application
  7. Notice the new input device icon now embedded into your application
  8. Configure your back-end connectors
  9. Add new document types or create a new form
  10. Depending on your subscription services you can Create a Conversion widget or Create a Data Capture widget which has field mapping capability to map Data Capture index fields to database fields in the back-end application
  11. Reporting of all subscription services with easy renewal


User operation (Complexity = Easy):

  1. User clicks a capture icon or hyperlink to acquire an image
  2. User verifies the extracted data to ensure high accuracy
  3. After confirmation, the data is saved immediately into the back-end application
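
The verify-then-save flow in the user operation above might be sketched as follows.  This is a minimal illustration under stated assumptions: we assume the capture engine reports a confidence score between 0 and 1 for each field, and the 0.90 threshold, the class and the function names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ExtractedField:
    name: str
    value: str
    confidence: float  # 0.0 to 1.0, assumed to come from the capture engine


REVIEW_THRESHOLD = 0.90  # assumed cut-off; tune per document type


def fields_needing_review(fields, threshold=REVIEW_THRESHOLD):
    """Return the names of fields the user should look at before saving."""
    return [f.name for f in fields if f.confidence < threshold or not f.value.strip()]


def confirm_and_save(fields, corrections, save):
    """Apply the user's corrections, then hand the confirmed record to `save`."""
    record = {f.name: corrections.get(f.name, f.value) for f in fields}
    save(record)  # e.g. a function that writes to the back-end application
    return record
```

Only the low-confidence or empty fields are shown to the user for review, which keeps the verification step quick while still catching likely recognition errors.
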


Associated screen prints on this solution:

  1. ABBYY Online registration form

  2. Login to access online services

  3. Configure services

  4. Create Custom Objects in Salesforce

  5. Create Custom Fields with Data Types

  6. Configure input device with copy and paste code

  7. Paste code into your application

  8. Capture device icon embedded into application

  9. Configure back-end connectivity

  10. Document Type Configuration

  11. Creating a new form

  12. Create a Conversion widget

  13. Create a Data Capture widget

  14. Reporting


  15. User clicks the icon to acquire an image

  16. Verify extracted data

  17. Once confirmed the data is stored directly into the back-end system



This is a fairly sophisticated integration that can be achieved rather easily using modern platform development tools and various cloud services.  Do you have any experience using platform services?  Is this type of as-a-service for Conversion and Data Capture of interest to you?  Do you have a specific use case scenario to share?  We’d love to hear from you.

Systems of Record and High Collaboration for Box


Use Case:  Enterprise Content Management (ECM) systems, or to use what is probably a better description, Systems of Record (SOR), have a long heritage of providing niche functionality that allows organizations to effectively access content via search, securely retain and destroy it with retention schedules, and enforce business policy with governance rules.  Your organization wants to utilize all the benefits of the System of Record, yet you also want to encourage collaboration among your users, because you know many business processes involve sharing information on a particular piece of content before it officially needs to enter the ECM as a “record”.  The best way to offer both a solid ECM solution and a highly collaborative environment is to use Box outside of your corporate firewall so users can efficiently share information.  Then, once the content needs to enter the ECM, you can either have the ECM system monitor a watched folder and bring it in, or have your users declare a record and push it immediately into the SOR.

Features
  • Records management, retention schedules and business policy
  • Content collaboration outside of firewall
  • Security behind and outside the network

Benefits
  • Better adherence to compliance laws
  • Ease of use encourages high user adoption
  • Peace of mind that information is secure without exception

Solution Description:   You have done your due diligence and over the years have tuned your ECM system into a well-oiled records management machine.  While this system is operating nicely, you find that a lot of content changes often, especially early in its lifecycle, and does not necessarily need to enter your System of Record until it has gone through these initial rounds of changes.  Therefore, you will set up a highly collaborative environment outside of your corporate firewall using Box.  Then you will use one of two relatively simple integration methods to allow content to flow easily from Box into your ECM system.  You will allow the users themselves to declare a record and send the content immediately from Box into your ECM or, as a good technical architecture rule in general, you will have your ECM automatically reach out and look into Box to review the status of documents and retrieve any that should have been declared but inadvertently were not.
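
As a rough illustration of the second method, where the ECM reaches out to Box, the selection logic might look like the Python sketch below.  The data class, the function name and especially the "unchanged for 30 days" rule for spotting documents that probably should have been declared are assumptions made for this example; your own records policy would define the real rule.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class BoxDocument:
    file_id: str
    name: str
    modified_at: datetime
    declared: bool  # True once a user has declared the document a record


def select_for_sweep(documents, already_imported, now, quiet_period=timedelta(days=30)):
    """Pick the documents the ECM should pull in during a scheduled sweep.

    Declared records are always picked up. Undeclared documents are picked
    up only when they have not changed for `quiet_period` (an assumed rule).
    """
    picked = []
    for doc in documents:
        if doc.file_id in already_imported:
            continue  # never import the same file twice
        stale = now - doc.modified_at >= quiet_period
        if doc.declared or stale:
            picked.append(doc.file_id)
    return picked
```

Tracking the IDs that were already imported is what lets the sweep run repeatedly on a schedule without creating duplicates in the System of Record.
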

System Requirements:

Note:  This is a conceptual solution and will require some level of integration work, although it could be minimal, to achieve the end result.

  1. Box account
  2. ABBYY  or ABBYY service account
  3. System of Record/ECM system


Steps/Architecture (Complexity = Moderate to Involved):

  1. Create a general collaboration shared work area, or areas, in Box
  2. Then create sub-folders to mimic your existing organizational infrastructure such as Accounting, Marketing, Sales, etc. and use Box folder permissions to invite collaborators for each folder
  3. These first two steps are to create areas for collaborative work.  Now you want to create an Upload folder for finished work to be sent to your System of Record
  4. As users complete collaboration on a particular piece of content they would simply use an integrated ‘Upload to ECM’ button within their application to upload the document
  5. At this point, depending on how the solution is integrated, the solution can do one of many things utilizing ABBYY hosted services.  Often there are two options:
    1. Unattended, where once the user presses the button the content is processed, converted and stored with no further user interaction
    2. Interactive, where the content is processed and information is extracted, yet you want the user to verify the accuracy of the data captured
  6. For the conversion process itself, and in particular for Systems of Record integration, a popular output is an image and a corresponding XML file with the extracted index fields
  7. Now, again depending on the method of integration, there are typically two scenarios that can deliver the images and extracted results to your System of Record
    • Push:  Push, as the term would indicate, means that there is an integrated procedure within Box where files sent to the Upload folder are immediately forwarded to the System of Record
    • Polling:  Polling means that the System of Record checks the Box Upload folder at some interval to see if there are new files to import.  Each method has its pros and cons; it really depends on an organization’s specific requirements
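
The image-plus-XML output mentioned in the steps above can be sketched in a few lines of Python.  The element and attribute names below are invented for illustration; a real System of Record will dictate its own import schema.

```python
import xml.etree.ElementTree as ET


def build_index_xml(image_name, doc_type, index_fields):
    """Create the XML companion file content for one captured image."""
    root = ET.Element("document", {"image": image_name, "type": doc_type})
    for name, value in index_fields.items():
        field = ET.SubElement(root, "field", {"name": name})
        field.text = value  # special characters are escaped on output
    return ET.tostring(root, encoding="unicode")


def read_index_xml(xml_text):
    """Parse the companion file back into (image, type, fields)."""
    root = ET.fromstring(xml_text)
    fields = {f.get("name"): f.text or "" for f in root.findall("field")}
    return root.get("image"), root.get("type"), fields
```

Whether the files are delivered by push or by polling, the System of Record imports the image and reads the index values from the XML file that travels with it.
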


User operation (Complexity = Easy):

  1. Since the logic of the document workflow, as well as the technical integration to move content between the various folders, will have been done by a systems integrator/software developer, the user operation is as simple as pressing a button
  2. It is important to note that while this solution is extremely easy from a user operation standpoint, one of the most important things to consider is operator training on how to utilize the system most effectively.  As the number of users, departments and processes increases, this creates a great opportunity for highly efficient collaboration, but it could also introduce a level of confusion that you would like to avoid


Associated screen prints on this solution:

1.  Box general collaboration area

2.  Box sub-folders and access permissions

3.  UPLOAD folder

4.  Integrated ‘Upload to ECM’ button

5.  Technical workflow transparent to the user

6.  Data quality verification

7.  XML file output results

8.  Push and Polling transfer methods

9.  Simple user experience with one button operation

We’d like to hear from you on this innovative idea.  Does our suggestion of incorporating the best qualities of traditional systems with the best qualities of ‘disrupted’ systems appeal to you?  Can you think of other mashup concepts?  We would appreciate your feedback.

Mobile with Automatic Data Capture for Box

Use Case:  Mobile devices continue to proliferate among users all over the world at an astonishing rate.  While the benefits of ‘consuming’ information and content on mobile are rather obvious these days, these devices offer other fantastic opportunities.  Specifically, camera-enabled devices make it possible to ‘capture’ information instead of just ‘consuming’ it.  Field Service technicians capturing work order signatures, accepting and processing checks for deposit or collecting invoices, all in the field, become far more efficient.  Likewise, long haul truck drivers can collect bills of lading, trip sheets, scale tickets or vehicle expense receipts and process them while still on the road instead of waiting to get back to a computer, which helps reduce costs and supports positive cash flow in your business process.

However, for our particular use case we will use the example of a traveling salesperson.  You travel to a customer site to finalize a deal.  A new person is introduced into the mix who will be the main contact for all account matters, so they naturally introduce themselves and hand you a business card.  Additionally, they sign the contract and the non-disclosure agreement, and your organization requires that they provide an authorized signature on the written proposal itself.  Great, congratulations!  But not so fast: you cannot begin the delivery process on your goods and services until all the information is entered into your corporate system.  Traditionally, you could not kick off the delivery process until you returned to the office, possibly days later.  Fortunately for you and your organization, you are innovative and have decided to supply your sales team with mobile applications such as ABBYY Business Card Reader for Salesforce and ABBYY FineScanner for Box.  So, all you do is snap a photo of the business card; all the fields on the card such as Name, Title, Company and E-mail address are automatically extracted with ABBYY Business Card Reader; if necessary you correct any information; and finally you send it directly into Salesforce.  For all the supporting documents such as the contract, NDA and signed proposal, you simply use the ABBYY FineScanner for Box mobile application to capture a collection of all these related documents and send them directly into Box.  Now, in a matter of minutes, you have created the new contact within Salesforce and supplied all the items necessary to start the delivery process, and you have barely even left your customer’s office!

Features
  • Remote deposit capture
  • Sales enablement
  • Capturing data in real-time

Benefits
  • Capture check images to immediately receive payment
  • Capture a signed contract to kick off a delivery process
  • Better adherence to policy and compliance

Solution Description:   We will use two pre-built mobile applications for specific use cases to create an efficiency-producing solution.  For the business card we will use the mobile ABBYY Business Card Reader for Salesforce.  This application allows users to take a photo of a business card, and the technology performs Optical Character Recognition to extract all the business card data.  Then the user can upload it directly into Salesforce.  Next, for all the other documents related to this account, we use ABBYY FineScanner for Box to capture a collection of documents without the tedious cycle of capture then upload, capture then upload, and so on.  For additional functionality or customization of mobile capture solutions, the ABBYY Mobile Data Capture Solution (MDCS) provides the highest level of classification and extraction technology.
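
To give a feel for what "extracting the business card data" means once OCR has produced plain text, here is a deliberately naive Python sketch.  It is a stand-in for the recognition step and not how ABBYY Business Card Reader works: it assumes the card lists name, title and company on the first three text lines and finds the e-mail address by pattern.  The function name and field names are hypothetical.

```python
import re

# Simple pattern for an e-mail address; good enough for an illustration.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def parse_business_card(ocr_lines):
    """Turn OCR text lines into Name, Title, Company and Email fields.

    Naive positional heuristic: after removing the e-mail line, the first
    three non-empty lines are taken as name, title and company.
    """
    lines = [ln.strip() for ln in ocr_lines if ln.strip()]
    email = ""
    remaining = []
    for ln in lines:
        match = EMAIL_RE.search(ln)
        if match and not email:
            email = match.group(0)
        else:
            remaining.append(ln)
    # Pad so the unpacking below works even for sparse cards.
    name, title, company = (remaining + ["", "", ""])[:3]
    return {"Name": name, "Title": title, "Company": company, "Email": email}
```

A heuristic this simple will get some cards wrong, which is exactly why the user is given the chance to correct the extracted fields before they are saved.
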

System Requirements:

Note:  This is an integrated solution using Box Embed functionality.  This solution does not involve any difficult software development efforts, rather a Salesforce administrator can just literally copy and paste a few lines of code and have this capability available nearly immediately.

  1. Box account and Box Embed integration
  2. ABBYY Business Card Reader (BCR)  for Salesforce and ABBYY FineScanner for Box … or ABBYY Mobile Data Capture Solution (MDCS)
  3. Smart phone with camera


Configuration Steps (Complexity = Simple):

  1. Install ABBYY Business Card Reader for Salesforce and ABBYY FineScanner for Box on your mobile device
  2. Configure both applications to connect to their respective destinations (Salesforce and Box).  Include the Box Embed HTML code in your destination application (e.g. Salesforce or NetSuite)
  3. For Business Card, take a photo of the business card with BCR, validate the extracted results and then save into Salesforce
  4. For the other related items, take photos of all the documents using the Batch Mode function, add any tags you wish, then Send to Box
  5. Log in to Salesforce to view the new contact and the corresponding documents securely stored in Box


Associated screen prints on this solution:

1. Box Embed copy and paste code

2. ABBYY Business Card Reader (BCR)

3. ABBYY FineScanner for Box ‘Batch Mode’ for multiple image upload at once

4. Add tags

5. Save and send to Box

6. Business Card details saved in Salesforce and associated images stored in Box seamlessly

As always, we are interested in hearing from you.  Do you have a story to share?  Would you like to see a particular feature?  Please let us know.

Exploiting Big Data with Indexes for Box


Use Case:  In today’s business environment, more than ever, it’s simply not good enough to be average.  Organizations of all sizes have to strive to create competitive advantages, understand trends and gain better insight into operational efficiency.  One of the most useful techniques to accomplish these goals is to Exploit Big Data through analysis.  However, this is challenging due to the volume, velocity and variety of content that must be analyzed.  Image-only files are useless in data analysis.  Therefore, the all-important first step in exploiting all of your content is to apply indexes so that computer systems can begin to understand the information.

  1. Reporting:  Business executives are paid to make important decisions about the business, and these decisions are often based on reports.  These reports are often compiled from various data sources such as spreadsheets, interviews with customers or employees and possibly other documents.  Gathering all this data is not only time-consuming but also problematic, because the data is often presented in an inconsistent manner.  For this reason you will want to use a Big Data system such as Splunk, where business executives can have instant access to real-time data sets from various sources, presented through dashboards or graphics that clearly show trends or other information pertinent to the decision-making process.
  2. Predictive analytics:  Historical reporting is fantastic for analyzing information, yet that information is, by definition, in the past.  Imagine if you could proactively determine a trend or predict future events with solid data.  This is a major benefit of Big Data aggregation.  For example, given the right set of data you can probably predict whether mortgage interest rates will increase or decrease in a particular geography.  You would use statistics such as current housing inventory, real-time unemployment rates and possibly the latest transactions within a certain time period.  The same Big Data aggregation concept applies to a completely different application of predictive analytics: the field of Healthcare.  If you can feed enough Index information into a Big Data solution, healthcare providers can narrow down the proper diagnosis for people with illnesses much more quickly, which can enrich people’s lives.
  3. Business process improvement:  There is always room for improvement, especially in the business world, and the most effective way to achieve positive improvement is through visibility into the business processes themselves.  Once you understand a process, you can apply metrics to it, such as the time needed to complete a task or the steps needed to finish a project.  A Big Data solution such as Splunk is an ideal complement to efficiency-improving technologies: ABBYY Data Capture, with tangible return on investment through reduced labor costs associated with manual data entry, and Box, with highly effective collaboration where enterprise workers can get work done quickly and be more effective overall in their business activities.  Deploying a Big Data analysis system together with Data Capture efficiency and secure mobile Collaboration is one way to achieve better process improvement, but just imagine all the possibilities that open up with the data itself.  And it all starts by Exploiting Big Data with Indexes.

Features and Benefits:
  • Automatic indexing of relevant data
  • Full-page for complete index
  • Touch indexing for structured data extraction
  • Reduces costs associated with manual data entry
  • Ability to analyze all data sets
  • Offers ease of use for high user adoption

Solution Description:   This solution might sound daunting and complicated but it’s actually straightforward and logical.  There are three basic concepts: Index Creation (ABBYY technology), Index Analysis (Splunk) and secure Image Storage (Box).  We will use several technologies to create indexes for various purposes and then feed all these indexes to our Big Data system so that this software can do what it does best.  The Big Data system allows administrators to easily aggregate all this data and then create dashboards, reports and other useful business intelligence tools.  So the process is quite logical:  Capture indexes from all sources, including existing databases, paper documents and, of course, images, and send all these indexes to Big Data.  Then send the images to Box for safe storage, easy access and effective collaboration.
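
To make the “index” idea concrete, here is a minimal Python sketch of how the index fields captured for one document might be written as a line of JSON into a directory that the Big Data system watches.  The field names (`doc_type`, `vendor`, `total`), the `image_name` key and the `index_events.jsonl` file name are hypothetical; the real fields depend entirely on what your capture software exports, and this is not the output format of any ABBYY product.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def make_index_event(source: str, fields: dict, image_name: str) -> dict:
    """Combine captured index fields with bookkeeping keys for one document."""
    return {
        **fields,  # captured values first, so the fixed keys below always win
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "source": source,          # which capture product produced the index
        "image_name": image_name,  # the image itself is stored in Box
    }


def append_event(log_path: Path, event: dict) -> None:
    """Append one event as a single JSON line to the monitored file."""
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(event, sort_keys=True) + "\n")


if __name__ == "__main__":
    # A temporary directory stands in for the folder the Big Data system monitors.
    with tempfile.TemporaryDirectory() as drop_dir:
        log_file = Path(drop_dir) / "index_events.jsonl"
        event = make_index_event(
            "FlexiCapture",
            {"doc_type": "invoice", "vendor": "Acme", "total": "125.00"},
            "invoice-0001.pdf",
        )
        append_event(log_file, event)
        print(log_file.read_text(encoding="utf-8"))
```

One event per line keeps each document’s index self-contained, which makes it easy for a log-oriented system to pick up new records as they are appended.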


System Requirements:

Note:  This is a software developer and systems integrator solution.  We are using Splunk as our Big Data aggregator in this solution because it is so easy to configure, yet extremely effective.  Splunk can only perform well when you provide plenty of “Index” information.  As the Splunk architecture graphic shows, “Index” is at the core; without it, Big Data cannot even begin analyzing different data sets.

  1. Box account
  2. ABBYY FlexiCapture for Automatic Data Capture
  3. ABBYY Recognition Server for Full-Page recognition
  4. ABBYY TouchTo for touch indexing
  5. Splunk Big Data software (free download)


Configuration Steps (Complexity = Moderate to Involved):

  1. Start Splunk and choose Add data
  2. Depending on the output type and format of your indexes, select the proper Splunk Add Data function
  3. Now connect Splunk to your data source(s)
    1. For example, for Recognition Server you might choose ‘From files or directories’ and, as an option, Preview data before indexing
    2. …and for FlexiCapture you might choose ‘Any other data…’ then ‘Consume data from databases’ because you output to a SQL database directly
    3. …and for TouchTo you might choose ‘A file or directory of files’
  4. After connecting all the index data sources to Splunk it is advisable to review the Splunk Manager options to familiarize yourself with all the various settings and configurations available
  5. Now that you have configured Splunk to use Indexes from your various Data Capture and Conversion sources, you will want to gather the information held within Box.  To do this, a software developer would use the Box API (Application Programming Interface) to import data such as tags, comments or file info
  6. A complete list of all the Splunk Indexes can be viewed in Manager
  7. Once all the indexes have been aggregated within Splunk then organizations can truly realize the benefits of Big Data with detailed reporting, predictive analytics and/or improved business process via simple visual tools such as dashboards
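
The Box API step above can be sketched in Python using only the standard library.  This is a minimal illustration, not a finished integration: it assumes you already hold a valid OAuth 2.0 access token (obtaining one is out of scope here), it relies on the file-info and file-comments endpoints of Box’s public v2 REST API as we understand them, and the shape of the resulting event (`source`, `file_id`, `comments` and so on) is a hypothetical layout chosen for illustration.

```python
import json
import urllib.request

BOX_API = "https://api.box.com/2.0"


def box_get(path: str, access_token: str) -> dict:
    """Issue an authenticated GET against the Box API and decode the JSON reply."""
    request = urllib.request.Request(
        f"{BOX_API}{path}",
        headers={"Authorization": f"Bearer {access_token}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def to_index_event(file_info: dict, comments: dict) -> dict:
    """Flatten file info and comments into one event (hypothetical layout)."""
    return {
        "source": "box",
        "file_id": file_info.get("id"),
        "name": file_info.get("name"),
        "tags": file_info.get("tags", []),
        "modified_at": file_info.get("modified_at"),
        "comments": [entry.get("message", "") for entry in comments.get("entries", [])],
    }


def fetch_file_event(file_id: str, access_token: str) -> dict:
    """Fetch file info (including tags) and comments for one file, then flatten them."""
    info = box_get(f"/files/{file_id}?fields=name,tags,modified_at", access_token)
    comments = box_get(f"/files/{file_id}/comments", access_token)
    return to_index_event(info, comments)


if __name__ == "__main__":
    # Offline demonstration with made-up sample data; no network call is made.
    sample_info = {"id": "12345", "name": "contract.pdf", "tags": ["legal"]}
    sample_comments = {"entries": [{"message": "Approved by finance"}]}
    print(json.dumps(to_index_event(sample_info, sample_comments), sort_keys=True))
```

Each resulting event could then be written to a file or directory that Splunk already monitors, so that Box metadata sits alongside the capture indexes.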


Associated screen prints on this solution:

1.  Splunk architecture with Index at the core

2.  Start Splunk

3.  Add data

4.  Splunk add From files or directories

5.  Data preview

6.  Any other data…

7.  Consume data from databases

8.  Splunk add A file or directory of files

9.  Splunk Manager

10.  Splunk Indexes Manager

11.  Splunk dashboard

What do you think?  “Big Data” is still a relatively new idea and many use cases are just coming to light.  How can you imagine using Big Data?  The possibilities to innovate in this area are tremendous.  Do you have a story to tell?

High-Volume Scanning and Conversion with WebDAV upload for Box

Use Case:  You are an organization that has been in business for a few years.  The organization has made a strategic decision: instead of maintaining your own internal technical infrastructure, with its costly procurement, expensive maintenance and the difficulty of keeping on top of security issues, you want to sign up for Box Enterprise and focus on your business priorities.  However, over the years you have accumulated paper documents in file drawers and cabinets.  These documents are difficult to share, hard to find (or impossible to find if misfiled or if someone else has the physical document) and pose a security risk because you cannot adequately apply security policy to them.

Features and Benefits:
  • Reduce technical complexity
  • Extremely easy user operation
  • Automatic folder structure creation and standard file naming conventions
  • Simple to configure in minutes
  • Encourages high adoption rate among users
  • Enables openness and compatibility among different business systems

Solution Description:   As much as we would like all information to begin as electronic content, the fact of the matter is that legacy paper persists and not all content is ‘born electronic’.  When we want to convert, and then archive, a large volume of paper documents, we will want to use a high-speed production document scanner together with OCR technology to create fully Searchable PDF files and then store them in Box.  The solution is to use WebDAV, one of the easiest methods of exchanging content between systems over the internet.  For those familiar with the concept, it works much like a mapped drive, and Box supports this method of integration.

System Requirements:

Note:  This is about the easiest possible method for transferring files to Box.  The solution allows a user to scan a large batch of paper documents, process them with recognition technology and then upload via a mapped-drive method.  Box simply appears as a shared drive and is therefore easily accessible from any internet-connected device that can map drives via the cloud.

  1. Box account
  2. Internet connection
  3. ABBYY FineReader for full-text and file conversion or ABBYY FlexiCapture for forms processing data capture


Configuration Steps (Complexity = Simple):

  1. On the computer where the ABBYY software is installed, go to My Computer and right-click, then select Map network drive…
  2. In the Map network drive dialog box, choose an open drive letter for your mapped drive, then input ‘’ as the Folder.  Also, check ‘Reconnect at logon’ if you would like to have this drive available whenever the user logs on to the computer
  3. It might take a few moments while Attempting to connect to Box via WebDAV
  4. When prompted for Windows Security login credentials, simply use your existing Box Username and Password
  5. You will now be logged into your Box account via a mapped drive
  6. Now you will want to configure your ABBYY software application to export to this Box WebDAV location.  Go to your ABBYY Export Settings and Add… a new location
  7. Within the Export Destination Wizard, select the Type of export, choose Export to data files and check the Save document images option
  8. Configure the Export Path to be your Box WebDAV folder location and construct the output folder and file naming convention however you wish.  You will note there is tremendous flexibility in the options available to you
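
The ABBYY export settings handle folder and file naming for you, but the idea behind step 8 can be illustrated with a short Python sketch.  It builds a folder structure and filename from extracted values and copies a scanned file onto the mapped drive.  The chosen convention (document type, then customer, then date), the function names and the sample values are all hypothetical; a temporary directory stands in for the mapped Box drive letter.

```python
import re
import shutil
import tempfile
from pathlib import Path

# Characters that are awkward in Windows paths, plus whitespace.
_UNSAFE = re.compile(r'[<>:"/\\|?*\s]+')


def clean(part: str) -> str:
    """Make one extracted value safe to use as a folder or file name component."""
    cleaned = _UNSAFE.sub("_", part.strip()).strip("_.")
    return cleaned or "unknown"


def build_export_path(root: Path, doc_type: str, customer: str,
                      doc_date: str, suffix: str = ".pdf") -> Path:
    """Return <root>/<doc_type>/<customer>/<doc_type>_<customer>_<date><suffix>."""
    folder = root / clean(doc_type) / clean(customer)
    return folder / f"{clean(doc_type)}_{clean(customer)}_{clean(doc_date)}{suffix}"


def export_document(source_file: Path, destination: Path) -> Path:
    """Create the folder structure if needed and copy the file contents."""
    destination.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(source_file, destination)
    return destination


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        mapped_drive = Path(tmp) / "box_drive"  # stand-in for the mapped drive
        scan = Path(tmp) / "scan0001.pdf"
        scan.write_bytes(b"%PDF-1.4 placeholder")
        target = build_export_path(mapped_drive, "Invoice", "Acme Corp.", "2013-05-01")
        print(export_document(scan, target).relative_to(mapped_drive))
```

Cleaning each value before it becomes part of a path matters because extracted text can contain spaces, slashes or colons that a mapped drive will reject.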


User operation (Complexity = Easy):

  1. Start your scanning software application
  2. Select the scanner you wish to use and Scan Pages
  3. Scan a batch of documents, verify the extracted results and then save to your Box WebDAV folder
  4. The folders and files, created automatically from the data extracted by Automatic Data Capture technology, will be immediately available within Box


Associated screen prints on this solution:

  1. My Computer, right-click to Map network drive…

  2. Map Network Drive

  3. Attempting to connect…

  4. Windows Security

  5. Box WebDAV successful

  6. Add Export Destination

  7. Export Destination Wizard

  8. Filenaming conventions


User Experience:

  1. Start scanning application

  2. Scan Pages

  3. Scan a Batch

  4. Folder and Filename results available immediately in Box

What do you think?  Does WebDAV help make collaboration easier?  This is just one way to accomplish tasks so we would love to hear from you.  Make a comment, join our notification lists or check back soon for updates.