TL;DR
August 2, 2026 is the EU AI Act's biggest enforcement milestone: high-risk system requirements become enforceable, transparency obligations apply for AI-generated content, and the European Commission's supervision and enforcement powers against general-purpose AI (GPAI) model providers come into force. Penalties for prohibited practices reach €35 million or 7% of global annual turnover. With ~3 months to the deadline, this is the time to lock in your compliance documentation: training data summaries, technical documentation, transparency notices, conformity assessments, and provider/deployer agreements. Save it all as date-stamped PDFs with Convert: Web to PDF and Convert: Anything to PDF. For tracking guidance from the AI Office and national authorities, ScrapeMaster helps build a structured intake.
What Becomes Enforceable on August 2, 2026
The EU AI Act has been phased in over multiple dates. August 2, 2026 is the date that triggers several compliance-intensive provisions:
Annex III high-risk AI system requirements. Annex III covers systems used in:
- Biometric identification and categorization
- Critical infrastructure operation
- Education and vocational training
- Employment, workers' management, and access to self-employment
- Access to and enjoyment of essential private services and public benefits
- Law enforcement
- Migration, asylum, and border control management
- Administration of justice and democratic processes
Operators of high-risk systems must have:
- Risk management systems
- Data governance and quality controls
- Technical documentation
- Record-keeping and logs
- Transparency to users
- Human oversight
- Accuracy, robustness, and cybersecurity controls
Transparency obligations for AI-generated content. These include labeling synthetic content (text, audio, images, video) and disclosing AI use in specified contexts.
Active enforcement powers. While GPAI providers have had obligations since August 2, 2025, the European Commission's enforcement powers — including the ability to request documentation, conduct evaluations, request compliance measures, and impose fines — come fully into force on August 2, 2026.
Penalty exposure. Fines for prohibited AI practices: up to €35 million or 7% of global annual turnover (whichever is higher). For other violations, up to €15 million or 3% of turnover. For supplying incorrect information, up to €7.5 million or 1%.
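To make the "whichever is higher" rule concrete, here is a minimal Python sketch of the fine ceilings using the figures quoted above. The tier keys and the function name are hypothetical labels chosen for illustration, and the sketch assumes the "whichever is higher" rule applies to all three tiers (the text above states it explicitly only for prohibited practices). It computes an upper bound, not the fine a regulator would actually impose.

```python
# Tier figures as quoted in this article: (fixed cap in EUR, percent of turnover).
# The dictionary keys are hypothetical labels, not terms from the Act.
FINE_TIERS = {
    "prohibited_practice": (35_000_000, 7),
    "other_violation": (15_000_000, 3),
    "incorrect_information": (7_500_000, 1),
}


def max_fine_eur(tier: str, global_annual_turnover_eur: int) -> int:
    """Return the fine ceiling for a tier: the higher of the fixed cap
    and the turnover-based cap (assumption: applies to every tier)."""
    fixed_cap, percent = FINE_TIERS[tier]
    turnover_cap = global_annual_turnover_eur * percent // 100
    return max(fixed_cap, turnover_cap)


# EUR 2 billion turnover: 7% is EUR 140 million, which exceeds EUR 35 million.
print(max_fine_eur("prohibited_practice", 2_000_000_000))

# EUR 100 million turnover: 7% is EUR 7 million, so the EUR 35 million cap applies.
print(max_fine_eur("prohibited_practice", 100_000_000))
```

For larger companies the percentage dominates; for smaller ones the fixed amount sets the ceiling.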
Who's Affected
The EU AI Act applies broadly:
Providers of AI systems placed on the EU market (regardless of where the provider is established).
Deployers of AI systems in the EU.
Importers and distributors of AI systems into the EU.
Product manufacturers using AI in EU-marketed products.
General-purpose AI (GPAI) model providers (e.g., developers of large foundation models), with additional obligations for models that pose systemic risk.
Non-EU companies that serve EU users are in scope. The extraterritorial reach is similar to GDPR.
Training Data and Copyright Provisions
A key GPAI provision (an obligation since August 2, 2025, with Commission enforcement powers from August 2, 2026): every provider of a general-purpose AI model must publish a public summary of the datasets used for training. The summary gives regulators and users visibility into how AI models learn.
For web scraping and AI training:
- Web scraping for AI training is now subject to explicit copyright-compliance and disclosure duties in the EU
- Providers must check whether a data source has a copyright reservation
- Providers must exclude or license content where reservations exist
- Public summaries must describe data sources
This creates a documentation burden for providers — and an evidence opportunity for content creators who reserve their content from AI training.
What to Document and Save Before August 2
If you're a provider or deployer in scope, build your compliance binder now:
A. Risk Management Documentation
- Risk identification process
- Risk evaluation methodology
- Risk mitigation measures
- Iterative risk monitoring
- Documentation of residual risks
B. Data Governance
- Training dataset inventories
- Validation dataset records
- Testing dataset records
- Data quality and bias assessment
- Statistical properties documentation
- Sources and provenance records
C. Technical Documentation
- System description (purpose, version, performance)
- Detailed system architecture
- Component descriptions
- Pre-trained foundation model details (if applicable)
- Training methodology
- Validation and testing methods
- Performance metrics
- Limitations and assumptions
D. Logs and Records
- Automatic logs of operation
- Performance logs
- Incident logs
- Decision logs (where required)
E. Transparency Documentation
- User instructions and interface design
- Explanation of capabilities and limitations
- Disclosure of AI involvement (where required)
- Synthetic content labeling implementation
F. Human Oversight Documentation
- Mechanisms for human supervision
- Override capabilities
- Training for staff who exercise overrides
- Documentation of human oversight in deployment
G. Conformity Assessment
- Internal control documentation
- Notified body involvement (if applicable)
- Conformity declaration
- CE marking process (where applicable)
H. Provider/Deployer Agreements
- Provider obligations passed to deployer
- Data sharing for compliance
- Incident notification obligations
- Termination provisions
I. Post-Market Monitoring
- Plan for monitoring after deployment
- Performance evaluation in production
- Incident reporting plan
- Update and re-assessment procedures
Convert: Anything to PDF handles the mixed-format internal documents — Word, Excel, screenshots, emails — that compliance binders contain. Local processing means sensitive technical documentation doesn't go to a cloud service.
For online regulator pages (the Commission, AI Office, national authority guidance), Convert: Web to PDF captures clean date-stamped copies.
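As a rough illustration of keeping the binder organized, the sketch below builds date-stamped file names and reports which of the sections A to I still have no saved PDF. The naming scheme (`date_section_title.pdf`) and both function names are assumptions made for this example; they are not features of the Convert tools.

```python
from datetime import date

# Section letters and titles follow the binder outline above.
BINDER_SECTIONS = {
    "A": "Risk Management Documentation",
    "B": "Data Governance",
    "C": "Technical Documentation",
    "D": "Logs and Records",
    "E": "Transparency Documentation",
    "F": "Human Oversight Documentation",
    "G": "Conformity Assessment",
    "H": "Provider/Deployer Agreements",
    "I": "Post-Market Monitoring",
}


def stamped_name(section: str, title: str, saved_on: date) -> str:
    """Build a sortable, date-stamped PDF name (hypothetical naming scheme)."""
    slug = "-".join(title.lower().split())
    return f"{saved_on.isoformat()}_{section}_{slug}.pdf"


def missing_sections(file_names: list[str]) -> list[str]:
    """Return the binder sections that have no saved PDF yet."""
    covered = {name.split("_")[1] for name in file_names if name.count("_") >= 2}
    return [f"{key}. {label}" for key, label in BINDER_SECTIONS.items()
            if key not in covered]


saved = [
    stamped_name("C", "System Architecture", date(2026, 5, 4)),
    stamped_name("A", "Risk Evaluation Methodology", date(2026, 5, 4)),
]
print(saved)
print(missing_sections(saved))
```

Putting the ISO date first means files sort chronologically in any file browser, which makes it easy to show what existed on a given day.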
Categorizing Your AI System
Step 1 of compliance: figure out which category your AI system falls into.
| Category | Example | Obligations |
|---|---|---|
| Prohibited | Social scoring, untargeted facial scraping, certain biometric categorization | Banned outright |
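| High-risk (Annex III) | Employment screening, credit scoring, education assessment | Comprehensive high-risk requirements (risk management, data governance, documentation, logs, transparency, human oversight, accuracy/robustness/cybersecurity) |
| High-risk (product safety) | AI in medical devices, vehicles | High-risk requirements + sector regs |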
| Limited risk | Chatbots, deep fakes | Transparency obligations |
| Minimal risk | Spam filters, video games | No specific obligations |
| GPAI (foundation models) | Large language models | GPAI-specific obligations + (if "systemic risk") additional |
For each AI use, classify and document the basis for classification. The classification drives which obligations apply.
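One way to record a classification and its basis is a small structured record per AI system. The following Python sketch is illustrative only: the class names, field names, and example values are assumptions, and GPAI status is kept as a separate flag because it is assessed independently of the risk category.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class RiskCategory(Enum):
    # Categories mirror the table above.
    PROHIBITED = "prohibited"
    HIGH_RISK_ANNEX_III = "high-risk (Annex III)"
    HIGH_RISK_PRODUCT_SAFETY = "high-risk (product safety)"
    LIMITED_RISK = "limited risk"
    MINIMAL_RISK = "minimal risk"


@dataclass(frozen=True)
class ClassificationRecord:
    """One documented classification decision (hypothetical record layout)."""
    system_name: str
    category: RiskCategory
    is_gpai: bool
    basis: str
    decided_on: date
    decided_by: str

    def __post_init__(self) -> None:
        # A classification without a written basis is not documented.
        if not self.basis.strip():
            raise ValueError("A classification record needs a written basis.")


record = ClassificationRecord(
    system_name="cv-screener",  # illustrative system name
    category=RiskCategory.HIGH_RISK_ANNEX_III,
    is_gpai=False,
    basis="Ranks job applicants, which falls under the employment category.",
    decided_on=date(2026, 5, 4),
    decided_by="compliance team",
)
print(record.category.value, "-", record.basis)
```

Making the record immutable (`frozen=True`) nudges teams to add a new dated record when a classification changes rather than overwrite the old one.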
Special Focus: Web Scraping and AI Training Data
The EU AI Act's interaction with web scraping has practical implications:
TDM (Text and Data Mining) opt-out compliance. Under the EU Copyright Directive, content creators can reserve their content from text and data mining for AI training. Providers must respect these reservations.
Public dataset summaries. Providers must publish summaries of training data, which lets content creators and regulators verify compliance.
Penalties. Failing to respect TDM opt-outs, or using training data without the necessary rights, is a violation.
For organizations doing AI training:
- Maintain logs of data sources and their TDM/rights status
- Implement opt-out detection (e.g., robots.txt extensions, content metadata)
- Document the basis for each major data source's inclusion
- Keep version history of dataset compositions
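The first three items can be sketched in a few lines of Python. This example checks a robots.txt file for a crawler-specific rule using the standard library's `urllib.robotparser` and records the result with a timestamp. `ExampleTrainingBot`, `SourceLogEntry`, and `check_source` are hypothetical names, and robots.txt is only one possible opt-out signal; whether a given signal counts as a valid rights reservation is a legal question this sketch does not answer.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from urllib.robotparser import RobotFileParser

CRAWLER_NAME = "ExampleTrainingBot"  # hypothetical user agent for illustration


@dataclass
class SourceLogEntry:
    """One row in a data-source log (hypothetical layout)."""
    url: str
    allowed_by_robots: bool
    basis: str
    checked_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def check_source(url: str, robots_txt: str, basis: str) -> SourceLogEntry:
    """Parse already-fetched robots.txt text and log whether the crawler may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    allowed = parser.can_fetch(CRAWLER_NAME, url)
    return SourceLogEntry(url=url, allowed_by_robots=allowed, basis=basis)


# Sample robots.txt that blocks the hypothetical crawler but allows everyone else.
sample_robots = """\
User-agent: ExampleTrainingBot
Disallow: /

User-agent: *
Allow: /
"""

entry = check_source("https://example.com/articles/1", sample_robots,
                     basis="Candidate source, pending rights review")
print(entry)
```

Passing the robots.txt text in as a string keeps fetching separate from checking, so the exact file that was evaluated can be stored alongside the log entry.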
For content creators / publishers:
- Implement TDM opt-out signaling
- Track which AI providers' summaries are public
- Save AI providers' published dataset summaries as PDFs (they may change)
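On the publisher side, one simple form of opt-out signaling is a robots.txt file that disallows specific AI crawlers. The sketch below generates such a file and verifies it with the standard library's `urllib.robotparser`. The crawler names are placeholders, not real user agents, and robots.txt is only one of the signaling options mentioned above; check each provider's documentation for the user agents it actually uses.

```python
from urllib.robotparser import RobotFileParser


def build_robots_txt(blocked_agents: list[str]) -> str:
    """Build robots.txt text that blocks the listed crawlers and allows all others."""
    blocks = [f"User-agent: {agent}\nDisallow: /" for agent in blocked_agents]
    blocks.append("User-agent: *\nAllow: /")
    return "\n\n".join(blocks) + "\n"


def is_blocked(robots_txt: str, agent: str, url: str) -> bool:
    """Return True if the given crawler is not allowed to fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, url)


# Placeholder crawler names for illustration only.
robots_txt = build_robots_txt(["ExampleAIBot", "SampleTrainingCrawler"])
print(robots_txt)
print(is_blocked(robots_txt, "ExampleAIBot", "https://example.com/essay"))
```

Saving a dated copy of the generated file gives you a record of when the reservation was first published.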
For our deeper coverage, see EU AI Act training data scraping rules and EU AI Act PDF audit trail compliance.
Comparison: EU AI Act vs. US AI Approaches
The EU AI Act is the most comprehensive AI regulation as of 2026. The US lacks an equivalent comprehensive federal law. Comparison:
| Dimension | EU AI Act (Aug 2, 2026) | US (federal + state) |
|---|---|---|
| Comprehensive AI law | Yes | No federal; some states |
| High-risk system regulation | Annex III categories | Sectoral (FTC, CFPB) |
| Foundation model rules | GPAI obligations | Voluntary commitments |
| Penalties | Up to €35M or 7% turnover | Varies by sector and state |
| Training data transparency | Required summaries | Voluntary or via lawsuits |
| Synthetic content labeling | Required (limited risk) | State-level (e.g., CA) |
| Enforcement | Centralized (AI Office) + national | Distributed |
Companies serving both EU and US markets typically build to EU standards (which exceed US requirements) — a pattern familiar from GDPR.
What to Save From the Regulators
The European Commission, the new AI Office, and member-state authorities will publish guidance in the run-up to August 2 and after. Capture as you go:
- Commission communications
- AI Office publications and guidance
- National authority designations
- Standards from CEN/CENELEC and ISO/IEC for AI
- Sector-specific guidance (e.g., from EBA for financial services, EMA for healthcare)
- Codes of Practice for GPAI
Use Convert: Web to PDF to save the regulator pages as date-stamped PDFs. Use ScrapeMaster to maintain a structured tracking sheet of "Source | Date | Title | URL | Affects [Provider/Deployer/GPAI]" as you browse.
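If you also want to maintain the same tracking sheet by script, the column layout above maps directly onto a CSV file. This is a minimal sketch using Python's standard `csv` module; it does not represent ScrapeMaster's export format or API, and the example row (title and URL) is a placeholder rather than a real publication.

```python
import csv
import io

FIELDS = ["Source", "Date", "Title", "URL", "Affects"]
ALLOWED_AFFECTS = {"Provider", "Deployer", "GPAI"}


def write_tracking_sheet(rows, handle) -> None:
    """Write guidance entries to a CSV handle, validating the Affects column."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    for row in rows:
        unknown = set(row["Affects"]) - ALLOWED_AFFECTS
        if unknown:
            raise ValueError(f"Unexpected Affects value(s): {sorted(unknown)}")
        # Store multiple affected roles in one cell, separated by "/".
        writer.writerow({**row, "Affects": "/".join(row["Affects"])})


# Placeholder entry; the title and URL are illustrative only.
rows = [{
    "Source": "AI Office",
    "Date": "2026-05-04",
    "Title": "Example guidance note",
    "URL": "https://example.eu/guidance",
    "Affects": ["Provider", "GPAI"],
}]

buffer = io.StringIO()
write_tracking_sheet(rows, buffer)
print(buffer.getvalue())
```

Validating the "Affects" values up front keeps the sheet filterable by role later.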
Frequently asked questions
What happens on August 2, 2026?
EU AI Act high-risk system requirements become enforceable, transparency obligations for AI-generated content apply, and the Commission's full enforcement powers against general-purpose AI (GPAI) providers come into force.
What are the EU AI Act's penalties?
For prohibited AI practices: up to €35 million or 7% of global annual turnover (whichever is higher). For other violations: up to €15 million or 3% of turnover. For supplying incorrect information: up to €7.5 million or 1%.
Does the EU AI Act apply to non-EU companies?
Yes, if you place AI systems on the EU market or your AI's outputs are used in the EU. The territorial scope is broad and similar to GDPR.
What is a "high-risk" AI system under the EU AI Act?
Annex III lists the categories: biometric identification, critical infrastructure, education and training, employment and workers' management, access to essential services, law enforcement, migration/asylum/border control, and administration of justice and democratic processes. AI systems used as a safety component of regulated products are also high-risk.
What are the transparency obligations under the EU AI Act?
Limited-risk systems (like chatbots) must disclose AI involvement. AI-generated synthetic content (deep fakes, generated text/audio/image/video) must be labeled. High-risk systems must provide information for human oversight.
What documents should I save for EU AI Act compliance?
Risk management documentation, data governance records, technical documentation, automatic logs, transparency materials, human oversight documentation, conformity assessment records, provider/deployer agreements, and post-market monitoring documentation. Save as date-stamped PDFs.
What's the EU AI Act requirement for training data?
GPAI providers must publish summaries of datasets used for training. They must respect TDM (text and data mining) opt-outs from copyright holders. Failing to respect TDM opt-outs is a violation.
How do I know if my AI system is in scope?
Categorize your AI system: prohibited, high-risk, limited risk, or minimal risk (and separately, GPAI). Each category triggers different obligations. Document your categorization basis.
Is there a "safe harbor" for small businesses?
The Act has some proportionality provisions, but small businesses with high-risk AI systems must still comply with the high-risk requirements. SMEs benefit from supporting measures (sandboxes, guidance), not from exemption from substantive obligations.
How does ScrapeMaster help with EU AI Act compliance?
ScrapeMaster helps you build a structured intake of regulator publications and guidance from the Commission, AI Office, and national authorities. As you browse these sites, ScrapeMaster captures source/date/title/URL/topic into a CSV — useful for compliance teams tracking guidance updates.
Bottom Line
August 2, 2026 is the EU AI Act's most consequential enforcement date. With ~3 months to go, the time for documentation is now: risk management records, technical documentation, training data summaries, transparency materials, conformity assessments, and provider/deployer agreements.
Use Convert: Web to PDF to save regulator pages and online disclosures as date-stamped PDFs. Use Convert: Anything to PDF for the mixed-format internal documents your compliance binder contains. Use ScrapeMaster to systematically track guidance from the Commission, AI Office, and national authorities as it's published.
For digesting the wave of long-form legal commentary about the August 2 deadline, CineMan AI summarizes pieces in your browser, so your compliance team can focus on the substantive guidance rather than the analysis volume.