Frequently Asked Questions

Frequently asked questions about Aleph Archives web archiving solutions

General

What is Aleph Archives web archiving service?

Aleph Archives is a leading web content collection and preservation platform, considered the gold standard in legal defensibility and trusted by Fortune 500 and Am Law 100 companies alike. The Aleph Archives platform enables organizations to collect, preserve, and analyze web content for litigation, internal audits and forensic investigations.

What other products form part of the Aleph Archives Platform?

Our web archiving platform is the core offering in the Aleph Archives Platform. Over time, we will be releasing new product categories that include solutions for proactively controlling the escalating costs and risks associated with internal and customer-facing web content and for organizations looking to harness, analyze and act on intelligence from the web (the world’s largest source of unstructured data).

Can I get an on-premises version of Aleph Archives Solutions?

Absolutely. Enterprise customers can purchase an on-premises version of Aleph Archives Solutions through our sales organization.

Is Aleph Archives a software or a service?

Both. Aleph Archives is available as a hosted service (SaaS) and as licensed software that runs on-premises.

How does Aleph Archives work?

Essentially, Aleph Archives’s software visits a website and collects what a person would see, and we store the content exactly as it was delivered from the target site. We navigate through the site, so we get all of the web pages and related content you need, including text and metadata from each page. We also make a PDF and PNG of each page and store that alongside the native web content.

Once the collection is complete, we make a working replica (links work, videos play, etc.) of the site available so you can see how the site performed when it was live. We also create exports using the PDFs of each page and the native content.

How big is the website I want to capture?

Not sure how big the site is that you want to capture? No problem. Aleph Archives uses a number of tools to provide our clients accurate page counts, and our experience across thousands of web capture projects helps clients make sure they’re getting the correct capture scope in place.

Can Aleph Archives capture sophisticated content like videos, pop-ups, and interactive elements?

The short answer is yes, Aleph Archives can capture virtually anything you can see in a browser. Want to give us a test? Show us the hardest, most complex content on your site. We’ll show you how Aleph Archives’s technology can give you the most complete, accurate and defensible captures available.

What constitutes difficult-to-capture content?

A great deal of web technology makes websites easy for people to use but very difficult for many capture tools (except Aleph Archives, of course) to harvest. Essentially, it is content that requires interaction with a web page: think drop-down list selections, mouse-overs, pop-ups, multimedia and so on.

Will the site look like it did originally?

Yes. Interactive elements, like mouse-overs, image carousels, drop-down lists and pop-ups, will play back like the original, as will all the links on the site, including video and other multimedia content.

Is it just a screenshot?

No, it is much more than a screenshot. With Aleph Archives, any captured web content looks and works like it did when it was live on the web. Screenshots give you no interactivity, and they miss critical content on modern web pages.

Technical Details

What is ISO 28500?

ISO 28500 is the international standard for storing collected web content. It was designed by an international body of experts in digital preservation, and it specifies a storage format called a WARC (Web ARChive) file.

Why are the ISO 28500 standard and WARC files relevant to my organization?

No matter what your reason for capturing web content, there are two things you don’t want:

  1. You don’t want to be trapped in a proprietary format that works with only one vendor
  2. You don’t want your captured web content to be un-viewable in the future

ISO 28500 WARCs make sure you avoid both of those issues. Virtually all other web capture methods are susceptible to those problems, and that’s what Aleph Archives wants to avoid for our clients.

What is a WARC?

A WARC file is an industry-standard format for storing collected web content and associated data. A WARC file is a container that provides structure to the data for processing, indexing and access. More importantly, a WARC file will preserve original web content exactly as it was delivered from the target site. It contains all of the metadata that allows a forensic examiner to verify the integrity of captured web content.
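To make the container idea concrete, here is a minimal sketch of what a single WARC record looks like and how a stored digest lets an examiner check that the content has not changed. The header names (WARC-Type, WARC-Record-ID, WARC-Date, WARC-Target-URI, WARC-Block-Digest, Content-Length) come from the WARC format itself. The function names are hypothetical, the SHA-1/Base32 digest encoding is a common convention rather than a requirement, and none of this is Aleph Archives's actual implementation.

```python
import base64
import hashlib
import uuid
from datetime import datetime, timezone


def _sha1_base32(data: bytes) -> str:
    """Digest encoding assumed here: SHA-1, Base32 (a common WARC convention)."""
    return base64.b32encode(hashlib.sha1(data).digest()).decode("ascii")


def build_warc_record(target_uri: str, payload: bytes, content_type: str) -> bytes:
    """Serialize one minimal WARC 'resource' record (illustrative only)."""
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    headers = [
        "WARC/1.0",
        "WARC-Type: resource",
        f"WARC-Record-ID: <urn:uuid:{uuid.uuid4()}>",
        f"WARC-Date: {now}",
        f"WARC-Target-URI: {target_uri}",
        f"WARC-Block-Digest: sha1:{_sha1_base32(payload)}",
        f"Content-Type: {content_type}",
        f"Content-Length: {len(payload)}",
    ]
    # Headers end with a blank line; the record ends with two CRLFs.
    head = "\r\n".join(headers).encode("utf-8") + b"\r\n\r\n"
    return head + payload + b"\r\n\r\n"


def parse_warc_record(record: bytes) -> tuple[dict[str, str], bytes]:
    """Split a single record into its named header fields and content block."""
    head, _, rest = record.partition(b"\r\n\r\n")
    lines = head.decode("utf-8").split("\r\n")
    fields = dict(line.split(": ", 1) for line in lines[1:])  # skip "WARC/1.0"
    return fields, rest[: int(fields["Content-Length"])]


def verify_block_digest(fields: dict[str, str], block: bytes) -> bool:
    """Recompute the digest of the stored block and compare it to the header."""
    algorithm, _, expected = fields["WARC-Block-Digest"].partition(":")
    if algorithm != "sha1":
        raise ValueError(f"unsupported digest algorithm in this sketch: {algorithm}")
    return _sha1_base32(block) == expected
```

A quick check: build a record for a small HTML payload, parse it back, and confirm that `verify_block_digest` returns `True` for the original bytes and `False` if even one byte is altered.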

What is native format web content?

Native format web content is the unaltered format in which the web content was originally delivered to a browser. It includes all of the components that make up a web page: HTML, CSS, JavaScript, images, text, etc. It is critical for authentication and forensics purposes when collecting web content.

How many hops? How many levels deep in a site can you go?

You control the number of hops and how many levels deep you want to go. Aleph Archives generally recommends +1 hop. For example, if a Facebook profile has links in posts or comments to sites outside of Facebook, Aleph Archives will follow each link and capture the resulting web page, but no links from that web page.

What if the site has links to third-party content, like a link inside a tweet or post?

Aleph Archives generally captures sites including what we call +1 hop, meaning we will follow all links outside the target site to one hop away, and then stop. You control the number of hops, and, therefore, how much content you want to include in the capture.
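The +1 hop rule can be sketched as a two-phase walk over a link graph: first collect every page reachable within the target site, then follow off-site links a fixed number of hops and stop. This is only an illustration under stated assumptions. The `plan_capture` function and the `links` dictionary (a stand-in for fetching pages and extracting their links) are hypothetical and are not part of the Aleph Archives product.

```python
from collections import deque
from urllib.parse import urlparse


def plan_capture(seed: str, links: dict[str, list[str]], extra_hops: int = 1) -> list[str]:
    """Return the URLs to capture: all reachable pages on the seed's host,
    plus off-site pages up to `extra_hops` links away from the target site."""
    target_host = urlparse(seed).netloc
    seen = {seed}
    on_site = [seed]
    frontier = []  # off-site URLs linked directly from the target site

    # Phase 1: breadth-first walk that stays inside the target site.
    queue = deque([seed])
    while queue:
        url = queue.popleft()
        for link in links.get(url, []):
            if link in seen:
                continue
            seen.add(link)
            if urlparse(link).netloc == target_host:
                on_site.append(link)
                queue.append(link)
            else:
                frontier.append(link)

    # Phase 2: follow off-site links hop by hop, then stop.
    captured = list(on_site)
    for hop in range(1, extra_hops + 1):
        captured.extend(frontier)
        if hop == extra_hops:
            break  # do not follow links from the final hop
        next_frontier = []
        for url in frontier:
            for link in links.get(url, []):
                if link not in seen:
                    seen.add(link)
                    next_frontier.append(link)
        frontier = next_frontier
    return captured
```

With `extra_hops=1`, a page on another site that is linked from the target site is captured, but the links on that page are not followed. Setting `extra_hops=0` keeps the capture entirely inside the target site.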

When Aleph Archives captures my website, will it impact the performance of the website?

In general, no. Aleph Archives looks like a user on your website, so we impact the performance of the site like any other user would. Aleph Archives’s professional services team works with our clients to make sure we have the smallest footprint.

Can I schedule captures for a particular time of day?

Yes. Many of our clients opt to have Aleph Archives run in overnight hours when usage of the website is lowest.

We have analytics packages on our website. Will Aleph Archives impact site analytics?

No. Aleph Archives’s professional services team uses a variety of techniques to make sure Aleph Archives isn’t impacting our clients’ site analytics at all. When you’ve done as many website captures as we have, this kind of attention to detail comes naturally.

Can you capture web content behind a login?

Yes. It takes serious sophistication to perform accurate, defensible captures behind a login, and the good news is that Aleph Archives does logged-in captures all the time.

The website I need to capture personalizes content based on the location of the user. Can Aleph Archives capture the site as if it were in a particular location?

Yes. Aleph Archives can trigger geolocation content so you can see how the site appeared to someone in a specific location.

What about other types of personalization or A/B site versions?

Aleph Archives uses a variety of methods to trigger all kinds of personalization characteristics of websites, including things like browser history triggers, preferences and A/B site direction.

Can I use this with Relativity?

Yes. Aleph Archives provides a Relativity .DAT file as a standard part of every capture. You can load content into virtually any eDiscovery review platform.

Compliance

Is Aleph Archives compliant with SEC, FINRA, FCA and other regulatory requirements for WORM storage?

Yes. Aleph Archives stores all captured web content in WORM storage.

Will Aleph Archives provide letters of attestation as a books and records custodian?

Yes. These letters are standard parts of the Aleph Archives agreement.

Will Aleph Archives provide affidavits/declarations, and can Aleph Archives serve as an expert witness, if required?

Yes. Aleph Archives has provided dozens of affidavits and declarations, and been called on to testify as an expert on numerous occasions.

How long will Aleph Archives retain data?

Aleph Archives stores content as long as you need it. Our clients set the retention schedule to meet regulatory requirements or litigation needs. For many clients in the financial services industry, the retention period is seven years.

Aleph Archives also supports retention and records management, including exceptions for legal holds, plus reporting and notifications, such as alerts for records that are coming due for disposition.
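As a rough illustration of how a retention schedule, a legal hold exception and an upcoming-disposition notice fit together, consider the sketch below. The `ArchivedRecord` class, its field names and the 30-day notice window are assumptions made for this example only. The seven-year default mirrors the financial services example above; actual schedules are set by each client.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class ArchivedRecord:
    captured_on: date
    retention_years: int = 7   # example default; clients set their own schedule
    legal_hold: bool = False   # a hold suspends disposition


def disposition_date(record: ArchivedRecord) -> Optional[date]:
    """Date the record becomes eligible for disposition, or None while on hold."""
    if record.legal_hold:
        return None
    start = record.captured_on
    target_year = start.year + record.retention_years
    try:
        return start.replace(year=target_year)
    except ValueError:
        # Captured on Feb 29 and the target year is not a leap year.
        return start.replace(year=target_year, day=28)


def due_soon(records: list[ArchivedRecord], today: date, within_days: int = 30) -> list[ArchivedRecord]:
    """Records whose disposition date falls within the notice window."""
    upcoming = []
    for record in records:
        due = disposition_date(record)
        if due is not None and 0 <= (due - today).days <= within_days:
            upcoming.append(record)
    return upcoming
```

A record under legal hold never appears in the `due_soon` list, however old it is, which is the exception behavior described above.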

How do I produce captured web content to regulators, investigators or opposing counsel?

You have a number of options, ranging from exported PDFs, which are always instantly available with Aleph Archives, to e-discovery industry-standard load files and a variety of native format production options.

How do I view captured content if I don’t have Relativity?

Aleph Archives provides a number of viewing options, including native format, where you can view the site just as it appeared when it was live online, plus a variety of other export formats, including offline working replicas of the captured sites.

What kind of reporting metrics can you send me?

Aleph Archives provides a variety of analytics and reports to help customers create a detailed picture of their web portfolios. You can monitor changes to sites, including text and image changes, plus quickly pinpoint critical items within your web content, like external links or forms. Additionally, you’ll receive a full suite of reports for compliance support.

Customization

Is it possible to customize captures, and if so, to what extent?

Yes. Captures can be customized in a variety of ways. The methods fall into these general categories:

  1. Frequency: You can control how often the site is captured (for instance, daily, weekly or monthly). You can also trigger captures based on events, so that a capture launches when a change or a new page is detected.
  2. Scope: Customization around scope typically involves inclusion or exclusion of:
    1. File types: Some customers opt to exclude video or PDFs because they have other systems of record for those content types.
    2. Third-party links: You can direct Aleph Archives to follow links that go outside the target domain.
    3. URLs within the target site: You can use pattern matching to exclude URLs within a site from a capture.
  3. Interactions: Aleph Archives can customize the capture to interact with the target site just like a person would.
  4. Personalization: Aleph Archives customizes settings to trigger the behavior of a site for things like device, location or other personalization features.
  5. Crawl speed: This enables you to determine the time window within which you would like the crawl to complete.
  6. Analytics: We also customize the crawl so that we do not trigger reporting or site analytics.
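The scope rules in the list above (file types, third-party links and URL pattern matching) can be pictured as a simple filter applied to every candidate URL. The sketch below is hypothetical: `CaptureScope` and its fields are names invented for illustration, and regular expressions are assumed as the pattern syntax. It does not describe the actual Aleph Archives configuration format.

```python
import re
from dataclasses import dataclass, field
from urllib.parse import urlparse


@dataclass
class CaptureScope:
    target_host: str
    excluded_extensions: set[str] = field(default_factory=set)         # e.g. {".pdf", ".mp4"}
    excluded_url_patterns: list[str] = field(default_factory=list)     # regular expressions
    follow_third_party: bool = True

    def allows(self, url: str) -> bool:
        """Decide whether a URL is inside the capture scope."""
        parsed = urlparse(url)
        path = parsed.path.lower()
        # File-type exclusions apply to every URL, on-site or off-site.
        if any(path.endswith(ext) for ext in self.excluded_extensions):
            return False
        # Links outside the target host are followed only if enabled.
        if parsed.netloc != self.target_host:
            return self.follow_third_party
        # Pattern-based exclusions apply to URLs within the target site.
        return not any(re.search(p, url) for p in self.excluded_url_patterns)


scope = CaptureScope(
    target_host="www.example.com",
    excluded_extensions={".pdf"},
    excluded_url_patterns=[r"/archive/\d{4}/"],
    follow_third_party=False,
)
```

With this example scope, `scope.allows("https://www.example.com/products")` is `True`, while PDF files, URLs under a four-digit `/archive/` year folder and pages on other hosts are all left out of the capture.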

How do I add new sites or change the scope?

It’s easy. You can add new sites through the Aleph Archives app, or our support team can add the sites for you. We do the heavy lifting for you.

Billing & Users

How does Aleph Archives invoice for social media users?

For social media, we invoice for every profile or page, across all networks.

What should I do if the list of internal domains is incorrect?

To ensure accurate billing, Aleph Archives needs to have a list of all internal domains. Even domains that aren’t archived but are internal to your company are required to help the system calculate usage. In some cases, domains our system discovered may have been added to your domains dashboard.

If the domains listed are incorrect, please contact Aleph Archives Support.

How many users can I have?

As many as you want. Aleph Archives doesn’t charge for users.

Do you have access control levels - can I control what users see?

Yes. You can control the content that’s available to each user, and you can set those permissions yourself without a support call to Aleph Archives.

Can I manage my users?

Yes. You can manage users through Aleph Archives’s admin features in the app. You don’t need a support call to Aleph Archives to add users or manage permissions. Plus, ask Aleph Archives about LDAP and SAML integration.

My company uses Active Directory. Can Aleph Archives integrate with AD so we can manage users from a single source?

Yes. Aleph Archives provides LDAP and SAML integration to support easier user management for many customers.

Training & Support

Do you provide training?

Yes, although you don’t need much training at all to use Aleph Archives. We do the heavy lifting behind the scenes so you get a clean, easy-to-use app.

How much training does it take?

For most users, Aleph Archives requires little to no training. Of course, admin and engineering users will get much more training.

Do I have to install any software?

No, not if you’re using Aleph Archives as a service. You can view content using a browser (Chrome, Firefox or Safari – we don’t recommend Internet Explorer). You can also download our viewer app, which many customers find easier.

If you’re using the on-premises version, Aleph Archives is software that runs on your organization’s network.

See the Most Complete Web Archives in Action

Schedule a 15-minute demo to discover how Aleph Archives automates regulatory web archiving for your organisation.
