Transcription, Research and Data Services
- Services and Prices
- Tax Compliance
Services and Prices
We offer manuscript transcription, XML markup, OCR correction, and data processing and cleaning. See the headings below for more details and typical prices.
Understanding the language and structure of historical documents is an important part of our services, allowing us to do things that non-English speakers and computers can't. We can help with pilot studies, handle the bulk of work for small to medium projects, or come in towards the end of a bigger project if you need extra help to meet a deadline. We are ideally placed to do jobs that are too big to do yourself but too small to justify the expense and trouble of recruiting an employee.
We will comply with whatever transcription conventions, database structures and tagging instructions you specify. Data will be delivered at agreed milestones.
We always charge a fixed fee for a whole contract. We can negotiate prices according to your project's budget and timetable, and the difficulty of the work to be done. If the work takes longer than expected, we will bear the loss and there will be no risk to your institution. If your institution is outside the UK, we can accept payment in US dollars or other currencies. The price will be fixed in your currency so you don't have to worry about exchange rates.
Transcription and Data Entry
We offer Careful Expert Single Keying for texts that are not suitable for Outsourced Blind Double Keying or automated recognition software.
- Transcription: we can create full text transcripts as plain text or word processor files.
- Data entry: we can transcribe documents as structured data using spreadsheets or databases. Understanding the source documents helps us to enter text into the correct fields, even if the original text is semi-structured or unstructured. We have particular experience of transcribing parish registers and census returns for demographic history projects.
- Data extraction: we can manually index or calendar relevant facts from structured or unstructured documents that do not require full transcripts, including formulaic Latin documents.
Prices and timescales for full transcripts vary according to the number of words per page, image quality, and difficulty of handwriting. A typical contract would charge around £4.50 or US$6 per page and deliver 250 pages per month, assuming an average of fewer than 300 words per page. Before we can give an accurate quote, we need to see all of the page images, unless we are already familiar with that type of document. Prerogative Court of Canterbury wills in PROB 11 contain a very large amount of text and will cost around £9 or US$12 per full page.
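As a rough illustration of how the typical figures above combine, the following Python sketch estimates cost and duration from a page count. The function name and the 600-page example are hypothetical, and the result is a ballpark figure, not a quote, since real prices depend on the page images.

```python
import math

# Typical figures quoted in the text above (illustrative only, not a quote).
PRICE_PER_PAGE_GBP = 4.50  # pages averaging fewer than 300 words
PAGES_PER_MONTH = 250      # typical delivery rate


def estimate(pages, price_per_page=PRICE_PER_PAGE_GBP,
             pages_per_month=PAGES_PER_MONTH):
    """Return (total cost in GBP, whole months needed) for a page count."""
    cost = pages * price_per_page
    months = math.ceil(pages / pages_per_month)
    return cost, months


# Hypothetical 600-page project at the typical rate.
cost, months = estimate(600)
print(f"600 pages: about £{cost:,.2f} over {months} months")
```

For PROB 11 wills, the same function can be called with `price_per_page=9.00`. The delivery rate for such pages is not stated above, so the month count should not be relied on in that case.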
We usually prefer clients to supply document images and get any necessary copyright clearance, although we can visit British archives and copy records in some cases. It's best if you can supply sharp, high-resolution, colour images but we can usually work with whatever you've got. We are able to transcribe from images that are not suitable for automated processes, and can sometimes enhance images to make the text more legible.
XML Markup
We can mark up transcripts with TEI or any other XML tags. For example, we can:
- mark up documents with XML as we transcribe them.
- add XML markup to existing text transcripts.
- add or change TEI XML tags in Text Creation Partnership texts to help you repurpose them for your project.
Basic XML tags can be added during transcription for little or no extra cost. Prices for more detailed markup or adding markup to an existing transcript vary according to how much markup is required and how far it can be automated. We will need detailed specifications and examples before quoting a price.
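To show what adding basic markup to an existing plain-text transcript can involve, here is a minimal Python sketch using the standard library's `xml.etree.ElementTree`. It assumes blank lines separate paragraphs and that each original line should be recorded with a TEI `<lb/>` element. The function name and sample text are invented for illustration, and a real project would follow the client's own tagging specification.

```python
import xml.etree.ElementTree as ET


def transcript_to_tei_body(text):
    """Wrap a plain-text transcript in basic TEI-style <body>/<p>/<lb/> tags.

    Blank lines separate paragraphs; the start of each original line is
    recorded with an empty <lb/> (line beginning) element.
    """
    body = ET.Element("body")
    for block in text.strip().split("\n\n"):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        if not lines:
            continue
        p = ET.SubElement(body, "p")
        for line in lines:
            lb = ET.SubElement(p, "lb")
            lb.tail = line  # the line's text follows its <lb/> milestone
    return body


# Invented sample text, not taken from any real document.
sample = "In the name of God Amen\nI John Smith of London\n\nItem I give to my wife"
print(ET.tostring(transcript_to_tei_body(sample), encoding="unicode"))
```

Markup of this kind is easy to automate. More detailed tagging, such as names, dates or abbreviations, usually needs human judgement, which is why the amount of automation affects the price.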
OCR Correction
We can use manual and automated checks to improve the quality of printed text automatically transcribed by Optical Character Recognition software. These checks are also part of our process for checking our own transcription work, and can be applied to manuscript transcripts. They include:
- text mining to trap mis-spelt words.
- smooth reading to trap dictionary words that don't make sense in context.
- targeted A-B checks for high-value data such as names and dates.
For standard modern spellings we can typically correct 5,000 words for £18 or US$24, and deliver up to 400,000 words per month. The cost may be higher if there is a high error rate, as manually correcting errors will take more time.
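The first check listed above, trapping mis-spelt words, can be sketched in Python as follows. This is a simplified illustration and not a description of our actual tooling: the function name, the tiny word list and the sample sentence are all hypothetical, and a real job would use a full dictionary suited to the period and language of the text.

```python
import re
from collections import Counter


def suspect_words(text, known_words, max_count=1):
    """Return words absent from known_words that occur at most max_count times.

    Rare unknown words are likely to be OCR errors; frequent unknown words
    are more likely to be names or terms missing from the word list.
    """
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(words)
    return sorted(w for w, n in counts.items()
                  if w not in known_words and n <= max_count)


# Hypothetical word list and OCR output for illustration.
known = {"the", "parish", "register", "of", "was", "kept", "in", "church"}
ocr_text = "The parifh register of the parish was kept in the chnrch"
print(suspect_words(ocr_text, known))  # prints ['chnrch', 'parifh']
```

A check like this only finds strings that are not words at all. Errors that produce a valid dictionary word pass through it, which is why smooth reading is listed as a separate check.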
Data Processing and Cleaning
We can use a combination of automatic and manual techniques to extract, process and clean up existing data for re-use in your project, or to check and clean data generated by your project. We have particular experience of cleaning CSV files, extracting data from the UK National Archives Discovery catalogue, correcting data created by genealogy paysites, and checking place name data. Data can be restructured and exported in many different formats, including MediaWiki XML for import as wiki pages, or can be returned in the original format.
Prices are hard to predict but once we know your requirements we might be able to find surprisingly easy and cost-effective ways of extracting data. For cleaning CSV files, we need to see the whole data set in order to quote a price because the time and cost depend more on the number of unique values to be checked than on the total number of records.
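The difference between total records and unique values can be seen with a short Python sketch using the standard `csv` module. The sample data, column names and function name are hypothetical; a real job would read the client's own file.

```python
import csv
import io
from collections import Counter


def unique_value_counts(csv_file, column):
    """Count how often each distinct value appears in one column of a CSV file."""
    reader = csv.DictReader(csv_file)
    return Counter(row[column].strip() for row in reader)


# Hypothetical sample data standing in for a client's CSV file.
sample = io.StringIO(
    "name,place\n"
    "John Smith,Yorke\n"
    "Mary Brown,York\n"
    "Ann Lee,York\n"
    "Tom Hall,Yorke\n"
    "Jane Wood,Leeds\n"
)
counts = unique_value_counts(sample, "place")
print(sum(counts.values()), "records,", len(counts), "unique values to check")
# prints: 5 records, 3 unique values to check
```

Each unique value only has to be checked once and the correction can then be applied to every record that contains it, so a large file with few distinct values can be quicker to clean than a small file with many.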
We can check, correct and give feedback on transcription done by others. For example:
- checking and correcting outsourced or in-house transcripts.
- assessing other contractors' bids for big contracts that we are not in a position to undertake.
- testing experimental automated transcription software.
- proofreading or providing sample transcripts for research into the accuracy of transcription methods.
We can offer more detailed and meaningful measures of transcription accuracy than automatically calculated percentages for character and word accuracy.
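For comparison, the automatically calculated percentages mentioned above are commonly derived from edit distance. The Python sketch below shows one common definition, accuracy as 1 minus the edit distance divided by the length of the reference text. This definition is an assumption made for illustration, since other tools calculate it differently, and the sample strings are invented.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,               # deletion
                            curr[j - 1] + 1,           # insertion
                            prev[j - 1] + (x != y)))   # substitution
        prev = curr
    return prev[-1]


def accuracy(reference, transcript):
    """Return 1 - (edit distance / reference length), floored at 0."""
    if not reference:
        return 1.0 if not transcript else 0.0
    return max(0.0, 1 - edit_distance(reference, transcript) / len(reference))


# Invented example: a checked reference and a transcript with two errors.
reference = "John Smith buried 3 May 1643"
transcript = "John Smyth buried 8 May 1643"
print(f"character accuracy: {accuracy(reference, transcript):.1%}")
print(f"word accuracy: {accuracy(reference.split(), transcript.split()):.1%}")
```

In this example only two characters differ, so the character score looks high, yet both errors fall on a surname and a date, which are exactly the high-value data a researcher is likely to rely on. A single percentage cannot show that.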
Line-by-line proofing typically costs about half as much as manual transcription, and more if there is a high error rate. We can also apply our OCR correction techniques to texts that don't need to be perfect. This will be cheaper than line-by-line proofing, but the cost for early-modern texts that retain original orthography will still be higher than for correcting OCR of modern printed texts.
We are not currently offering book indexing services.
Tax Compliance
US Tax Law
If you are based in the US, we are exempt from withholding because the work will be done outside the US. We can supply a statement to this effect and/or a W-8BEN-E form before commencing a contract if necessary.
UK Tax Law
We do not charge VAT because we are under the threshold for compulsory registration and our services are not classed as e-services.
Our contracts and working practices cannot be classed as disguised employment under IR35 rules because:
- We supply all equipment and software that we need to do the work. Your institution avoids extra costs, licensing issues, and health and safety liabilities.
- We do not charge by the hour. Instead, we will offer a fixed price for a whole contract. This makes transcription costs more predictable, helping you to plan your projects and stay within budget.
- If you find errors, corrections and further checks will be provided free of charge, giving us an incentive to transcribe accurately and keeping costs predictable.
- A project plan, including milestones to be delivered, will be written into the contract. We have total control over how we achieve the agreed outcome. The contract will specify a Supplier's Manager (usually Gavin Robinson) who will manage the work, ensure quality and liaise with you, but will not be required to do all of the work personally. This saves you staff hours that would otherwise be spent on managing transcribers, leaving academic staff free to research and teach, and students free to study.
- No mutuality of obligation: we will complete the tasks specified for the agreed price and nothing more.
- We will compensate your institution for losses caused by our negligence.
- We are at financial risk. Your institution benefits by passing risks on to us.
We will supply contracts written by our solicitor to guarantee that these terms will be properly enforced.
Since April 2017, UK clients in the public sector (including universities) have been responsible for assessing whether a contract is caught by IR35 and deducting National Insurance if necessary. Our contracts cannot be caught by IR35 for the reasons stated above. We will give your administrators all necessary information to prove that we are not caught by IR35. We will not tolerate any illegal deductions from our fees.