"""
BREAKTHROUGH ANALYSIS: The "Nothing Ever Changes" Trick

After careful analysis, I believe the solution involves exploiting:

HYPOTHESIS: Multiple-call non-determinism or caching issues

Looking at the code flow:
1. img1_bytes and img2_bytes are read from zip
2. They must have identical MD5 hashes (line 101)
3. But img1 = load_gray_from_bytes(img1_bytes) 
   and img2 = load_gray_from_bytes(img2_bytes) must produce different results

THE KEY INSIGHT:
================
If img1_bytes == img2_bytes (and identical MD5 hashes imply identical bytes,
barring a deliberate collision), then load_gray_from_bytes() is being called
TWICE on the SAME data!

The function is:
    def load_gray_from_bytes(data: bytes) -> Image.Image:
        with Image.open(io.BytesIO(data)) as img:
            return img.convert("L")

Since PIL Image.open() is called twice on identical bytes, the images SHOULD
be identical... UNLESS:

1. There's non-deterministic behavior in PIL (unlikely)
2. There's a bug in how the image context manager works
3. The model has non-determinism (but set_deterministic(0) is called)
4. OR... the ACTUAL trick is that the MD5 check happens AFTER classification!

WAIT! Let me re-read the order of operations...

Line 79-82: Load img1 and img2 from bytes
Line 84-99: Check pixel differences against reference
Line 101: Check MD5 match  <- THIS HAPPENS AFTER LOADING!
Line 103-118: Do classification

So the images are loaded, pixels checked, THEN bytes are compared!
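That flow can be sketched as a hedged reconstruction (verify_pair and the step
comments are my guesses at the verifier's structure, mirroring the line numbers
quoted above — not the actual verifier code):

```python
import hashlib
import io

from PIL import Image

def load_gray_from_bytes(data: bytes) -> Image.Image:
    # Same helper as quoted above.
    with Image.open(io.BytesIO(data)) as img:
        return img.convert("L")

def verify_pair(img1_bytes: bytes, img2_bytes: bytes) -> bool:
    # Hypothetical reconstruction of the verifier's order of operations.
    # Step 1 (lines 79-82): decode both images first.
    img1 = load_gray_from_bytes(img1_bytes)
    img2 = load_gray_from_bytes(img2_bytes)
    # Step 2 (lines 84-99): pixel checks against the references (omitted here).
    _ = (img1, img2)
    # Step 3 (line 101): only THEN are the raw bytes compared.
    if hashlib.md5(img1_bytes).digest() != hashlib.md5(img2_bytes).digest():
        return False
    # Step 4 (lines 103-118): classification would follow (omitted here).
    return True
```

The point of the sketch is only the ordering: decoding happens before the MD5
comparison, so whatever the decoder does with the bytes is already done by the
time the bytes are compared.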

BUT THAT STILL DOESN'T HELP because if bytes are identical, the loaded
images should be identical too...

ALTERNATIVE THEORY: PIL Image Object Reuse
===========================================
What if there's a bug where PIL caches or reuses Image objects?

Actually, looking more carefully at load_gray_from_bytes:
    with Image.open(io.BytesIO(data)) as img:
        return img.convert("L")

The return expression runs img.convert("L") first, producing a NEW Image
object; the context manager then closes the original on exit. If the converted
copy still referenced the closed source's data, this could cause issues!

Let me test if PIL has any quirks with this pattern...

REAL SOLUTION HYPOTHESIS: The Title Says It All!
================================================
"Nothing Ever Changes" - What if the trick is that img1 and img2 
are literally THE SAME FILE referenced twice in the ZIP?

ZIP files can have:
- Multiple central-directory entries whose offsets point at the same local file data
- Duplicate entry names, or two names written with byte-identical contents

So pair_00_img1.png and pair_00_img2.png could be different filenames
but pointing to the same underlying file content in the ZIP!
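A quick way to sanity-check this theory with the stdlib zipfile module (the
entry names mirror the challenge layout; the payload is a placeholder, not
real image data):

```python
import hashlib
import io
import zipfile

# Build an in-memory ZIP with two entry names carrying byte-identical data.
payload = b"placeholder bytes standing in for real PNG data"
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("pair_00_img1.png", payload)
    zf.writestr("pair_00_img2.png", payload)

# Read both entries back and compare.
with zipfile.ZipFile(io.BytesIO(buf.getvalue())) as zf:
    b1 = zf.read("pair_00_img1.png")
    b2 = zf.read("pair_00_img2.png")

# Identical bytes -> identical MD5 -> identical pixels after decoding,
# which is exactly why this theory fails the classification requirement.
assert hashlib.md5(b1).hexdigest() == hashlib.md5(b2).hexdigest()
```

So same-MD5 pairs are trivially constructible at the ZIP level; the hard part
is making them classify differently.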

But wait, that would make them have identical pixels too, which would
fail the classification requirement...

FINAL INSIGHT: Stateful PIL or Torch Behavior
==============================================
What if the model or PIL has some stateful behavior that changes between
the first and second image load?

Looking at model.py:
- Line 45: set_deterministic(0) is called
- But this sets the SEED; it doesn't necessarily prevent all non-determinism

What if there's:
1. Dropout that's not properly disabled in eval mode?
2. Batch normalization statistics that update?
3. GPU non-determinism?

But the model uses model.eval() and device="cpu"...

I need to actually TEST this to understand what's happening!
Let me create a script to explore PNG tricks and PIL behavior.
"""

from PIL import Image
import io
import hashlib
import numpy as np

def test_png_parsing():
    """Test if PIL has any non-deterministic PNG parsing behavior."""
    
    # Create a simple test image
    img = Image.new('L', (28, 28), color=128)
    
    # Save it to bytes
    buf1 = io.BytesIO()
    img.save(buf1, format='PNG')
    data = buf1.getvalue()
    
    # Load it twice
    img1 = Image.open(io.BytesIO(data))
    img1_arr = np.array(img1)
    
    img2 = Image.open(io.BytesIO(data))
    img2_arr = np.array(img2)
    
    print("Test 1: Loading same PNG bytes twice")
    print(f"  Arrays equal: {np.array_equal(img1_arr, img2_arr)}")
    print(f"  MD5 of bytes: {hashlib.md5(data).hexdigest()}")
    
    # Test with context manager (like in the code)
    def load_like_verification(data):
        with Image.open(io.BytesIO(data)) as img:
            return img.convert("L")
    
    img1 = load_like_verification(data)
    img2 = load_like_verification(data)
    
    print("\nTest 2: Loading with context manager pattern")
    print(f"  Arrays equal: {np.array_equal(np.array(img1), np.array(img2))}")
    
    # Test if returned image is still valid after context exit
    try:
        img1.load()  # Force load pixel data
        arr1 = np.array(img1)
        print(f"  img1 accessible after context: True, shape: {arr1.shape}")
    except Exception as e:
        print(f"  img1 accessible after context: False, error: {e}")

if __name__ == "__main__":
    print(__doc__)
    print("\n" + "="*80)
    print("TESTING PIL BEHAVIOR")
    print("="*80 + "\n")
    test_png_parsing()
    
    print("\n" + "="*80)
    print("CONCLUSION")
    print("="*80)
    print("""
The challenge name "nothing-ever-changes" suggests the adversarial approach is:
Creating examples where the FILE CONTENTS don't change (same MD5), but somehow
the classification changes.

Possible solutions to explore:
1. ZIP-level tricks (directory structure, file references)
2. PNG format quirks (multiple valid parsings of same bytes)  
3. PIL implementation details (context manager, lazy loading)
4. Model non-determinism (despite set_deterministic)
5. Floating-point precision differences in preprocessing
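
Point 5 is easy to demonstrate in isolation: float32 arithmetic is not
associative, so a reordered reduction in preprocessing could flip a value near
a decision boundary (a generic illustration, not taken from this pipeline):

```python
import numpy as np

# float32 addition is not associative: summing the same values in a
# different order can produce different results.
a = np.float32(1e8)
b = np.float32(-1e8)
c = np.float32(0.5)

left = (a + b) + c    # 0.5: the large terms cancel first
right = a + (b + c)   # 0.0: 0.5 is absorbed into -1e8 before cancelling

print(left, right, left == right)
```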

To solve this challenge, we'd need to:
1. Analyze the reference images in src/data/refs/
2. Understand what classifications are needed (config.py)
3. Find a way to create byte-identical PNGs that parse differently
4. Or find non-determinism in the model/preprocessing pipeline

This is a VERY clever challenge that exposes subtle implementation details!
""")
