Noise-Resistant Mobile PIN Authentication against Shoulder-Surfing and Spoofing
We present I-PIN, the first PIN authentication system that exploits structure-borne propagation acoustics to achieve robust resistance against visual eavesdropping, replay attacks, and environmental noise. The core innovation lies in modeling each PIN tap as a Location-coupled Acoustic Fingerprint (LocAF) that jointly encodes the finger's anatomical traits and the keypad's structural attributes, with path-dependent frequency suppression during structural propagation naturally amplifying fine-grained biometric distinctions.
By exploiting propagation-enhanced acoustic biometrics and introducing a Log-Energy Scaling Ratio (LESR) to model frequency attenuation, we design a novel LESR–wavelet denoising method that preserves fine-grained identity features while resisting noise and replay attacks. A deep learning framework combining LESR-enhanced PANNs with contrastive learning further disentangles user identity from behavioral and environmental variations.
I-PIN illustration: I-PIN authenticates users via tap-induced acoustic biometrics, effectively preventing PIN leakage from shoulder-surfing and replay attacks.
I-PIN Application Demo: Demonstrating the acoustic biometric-based PIN authentication system in real-world usage scenarios
This section demonstrates the core LocAF (Location-coupled Acoustic Fingerprint) signals captured from real user interactions. We present data from three users tapping two PIN digits (0 and 2), with each digit tapped twice to show consistency and variability.
The following table presents the complete dataset of LocAF signals captured from our dual-microphone array system. Each row represents a specific user-PIN-attempt combination, with corresponding visualizations for time-domain analysis and frequency-domain CWT spectrograms.
Schematic of LocAF generation and propagation in a smartphone. A fingertip tap on each PIN keypad excites a LocAF signal that propagates through distinct structure-borne paths to the device's built-in top and bottom microphones.
PIN | User | Attempt | Raw Audio | Time Domain | Top Mic CWT Spectrogram | Bottom Mic CWT Spectrogram |
---|---|---|---|---|---|---|
0 | u1 | c1 | ![]() | ![]() | ![]() | |
0 | u1 | c2 | ![]() | ![]() | ![]() | |
0 | u2 | c1 | ![]() | ![]() | ![]() | |
0 | u2 | c2 | ![]() | ![]() | ![]() | |
0 | u3 | c1 | ![]() | ![]() | ![]() | |
0 | u3 | c2 | ![]() | ![]() | ![]() | |
2 | u1 | c1 | ![]() | ![]() | ![]() | |
2 | u1 | c2 | ![]() | ![]() | ![]() | |
2 | u2 | c1 | ![]() | ![]() | ![]() | |
2 | u2 | c2 | ![]() | ![]() | ![]() | |
2 | u3 | c1 | ![]() | ![]() | ![]() | |
2 | u3 | c2 | ![]() | ![]() | ![]() | |
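Each data entry includes four key components: the raw audio recording, the time-domain waveform, and the CWT spectrograms from the top and bottom microphones.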
This section presents the Log-Energy Scaling Ratio (LESR) calculations for the LocAF data shown above, demonstrating how our novel LESR model extracts user-specific biometric features from the acoustic propagation patterns.
The LESR exploits asymmetric propagation paths between top and bottom microphones:
$$\Delta LESR_k(t) = \ln\left(\frac{R_{top}(k,t)}{R_{bot}(k,t)}\right)$$
Here $R_{top}(k,t)$ and $R_{bot}(k,t)$ denote the energy in frequency subband $k$ at time $t$ at the top and bottom microphones, respectively.
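As a rough illustration of this definition, the following Python sketch computes $\Delta LESR_k(t)$ from two synchronized microphone channels. It uses the analysis parameters reported for the LESR figures (5 ms window, 1 ms hop, 16 bands, 32 kHz). The function names, the equal-width band split, the rectangular window, and the `eps` guard against empty bands are assumptions made for this example, not details taken from I-PIN.

```python
import cmath
import math


def band_energies(frame, n_bands):
    """Energy of one frame in n_bands equal-width subbands.

    Uses a naive DFT over the positive-frequency bins (DC skipped).
    Bins that do not fill a whole band are dropped.
    """
    n = len(frame)
    n_bins = n // 2
    power = []
    for b in range(1, n_bins + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * b * i / n)
                  for i, x in enumerate(frame))
        power.append(abs(acc) ** 2)
    width = n_bins // n_bands
    return [sum(power[k * width:(k + 1) * width]) for k in range(n_bands)]


def delta_lesr(top, bottom, fs=32000, win_ms=5, hop_ms=1, n_bands=16, eps=1e-12):
    """Return one list per frame holding ln(R_top / R_bot) for each subband."""
    win = fs * win_ms // 1000   # 160 samples at 32 kHz
    hop = fs * hop_ms // 1000   # 32 samples at 32 kHz
    frames = []
    for start in range(0, min(len(top), len(bottom)) - win + 1, hop):
        r_top = band_energies(top[start:start + win], n_bands)
        r_bot = band_energies(bottom[start:start + win], n_bands)
        # eps (hypothetical) keeps the ratio finite when a band is empty
        frames.append([math.log((a + eps) / (b + eps))
                       for a, b in zip(r_top, r_bot)])
    return frames


# Demo on a synthetic 1 kHz tone that is twice as strong at the top microphone.
fs = 32000
top = [math.sin(2 * math.pi * 1000 * i / fs) for i in range(320)]
bottom = [0.5 * x for x in top]
frames = delta_lesr(top, bottom)
```

In this synthetic case the subband containing 1 kHz yields $\ln 4 \approx 1.386$, since halving the amplitude quarters the energy, while subbands with no energy stay near 0 because of `eps`.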
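The LESR model captures user-specific biometric signatures through path-dependent frequency attenuation. Its key properties are that it is position-dependent, behavior-resilient, and noise-aware (for environmental noise, $\Delta LESR \approx 0$).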
Analysis shows LESR consistency across multiple taps of the same digit by the same user, demonstrating the reliability of our biometric approach. The stability is quantified through correlation coefficients and temporal variance measurements.
The following table presents the complete dataset of LESR analysis results for all LocAF signals. Each row represents a specific user-PIN-attempt combination, with corresponding visualizations showing the LESR heatmap (frequency-time analysis) and detailed LESR time series.
LESR Analysis Parameters: 5 ms window, 1 ms hop, 16 frequency bands, 32 kHz sampling rate
PIN | User | Attempt | LESR Heatmap (Frequency-Time Analysis) | LESR Time Series (Detailed View) |
---|---|---|---|---|
0 | u1 | c1 | ![]() | ![]() |
0 | u1 | c2 | ![]() | ![]() |
0 | u2 | c1 | ![]() | ![]() |
0 | u2 | c2 | ![]() | ![]() |
0 | u3 | c1 | ![]() | ![]() |
0 | u3 | c2 | ![]() | ![]() |
2 | u1 | c1 | ![]() | ![]() |
2 | u1 | c2 | ![]() | ![]() |
2 | u2 | c1 | ![]() | ![]() |
2 | u2 | c2 | ![]() | ![]() |
2 | u3 | c1 | ![]() | ![]() |
2 | u3 | c2 | ![]() | ![]() |
LESR Visualization: Two types of analysis are shown: (1) LESR Heatmap provides frequency-time analysis showing LESR values across different frequency bands over time, and (2) LESR Time Series provides detailed temporal analysis with moving average.
The I-PIN system operates through a three-stage pipeline that transforms raw acoustic signals into reliable biometric authentication decisions.
System overview of I-PIN. The pipeline consists of three main components: (i) LESR-based Pre-processing, (ii) Identity Feature Extraction and Disentanglement, and (iii) Authentication with Multiple LocAF Samples.
(i) LESR-based Pre-processing leverages the on-device acoustic field to suppress noise and extract reliable LocAF segments.
(ii) Identity Feature Extraction and Disentanglement uses a PANNs encoder to extract fine-grained physiological features, while contrastive learning disentangles identity from behavioral variations.
(iii) Authentication with Multiple LocAF Samples aggregates multiple LocAF samples during PIN entry for robust user authentication.
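The aggregation stage can be pictured with a small Python sketch. The fusion rule used by I-PIN is not described here, so averaging per-tap match scores and comparing the mean to a threshold is only one plausible choice; the function name `authenticate`, the score range, and the threshold value are all hypothetical.

```python
def authenticate(tap_scores, threshold=0.5):
    """Accept when the mean per-tap match score reaches the threshold.

    tap_scores: one similarity score in [0, 1] per PIN tap (assumed format).
    threshold: hypothetical decision threshold.
    """
    if not tap_scores:
        raise ValueError("at least one tap score is required")
    return sum(tap_scores) / len(tap_scores) >= threshold


# A genuine user with one weak tap still passes; an impostor with one
# lucky tap is still rejected.
genuine = authenticate([0.92, 0.41, 0.88, 0.95])
impostor = authenticate([0.10, 0.62, 0.08, 0.15])
```

Averaging over several taps is what lets a single unreliable sample be outvoted, which is the intuition behind using multiple LocAF samples rather than one.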
Each PIN tap generates unique LocAF signals encoding finger anatomical traits and keypad structural attributes.
LESR exploits asymmetric propagation paths and has three key properties: it is position-dependent, behavior-resilient, and noise-aware (environmental noise → $\Delta LESR \approx 0$).
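The noise-aware property can be made concrete with a short worked case. Assume (a simplification for illustration, not a claim from the I-PIN evaluation) that airborne ambient noise arrives at both microphones with approximately the same subband energy $N(k,t)$, while a tap contributes structure-borne energies $S_{top}(k,t) \neq S_{bot}(k,t)$ that add to the noise energy:

$$\text{noise only: } \Delta LESR_k(t) = \ln\left(\frac{N(k,t)}{N(k,t)}\right) = 0$$

$$\text{tap plus noise: } \Delta LESR_k(t) = \ln\left(\frac{S_{top}(k,t) + N(k,t)}{S_{bot}(k,t) + N(k,t)}\right) \neq 0$$

Under these assumptions, subbands where the ratio stays near zero carry mostly noise, while tap-dominated subbands approach $\ln\left(S_{top}(k,t) / S_{bot}(k,t)\right)$.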
Self-adaptive algorithm: Multi-resolution SWT analysis → Band alignment → Stability scoring (MAD) → Selective reconstruction. Achieves 59.6 dB SNR improvement vs. 33.3 dB best baseline.
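The stability-scoring step can be sketched in Python as follows. This covers only that one step, and it rests on several assumptions: MAD is read as the median absolute deviation, the score is computed on each band's ratio series over time, and bands are kept when the MAD falls below a fixed threshold. The names `mad` and `select_stable_bands` and the value of `max_mad` are hypothetical; the SWT decomposition and the reconstruction are not shown.

```python
from statistics import median


def mad(values):
    """Median absolute deviation: median(|x - median(x)|)."""
    values = list(values)
    m = median(values)
    return median(abs(v - m) for v in values)


def select_stable_bands(band_series, max_mad=0.2):
    """Return indices of bands whose series has MAD <= max_mad.

    band_series: one list of values over time per band (assumed layout).
    max_mad: hypothetical stability threshold.
    """
    return [k for k, series in enumerate(band_series) if mad(series) <= max_mad]


bands = [
    [1.40, 1.38, 1.39, 1.41, 1.37],    # steady ratio, as a tap-dominated band might show
    [0.90, -0.70, 0.20, -1.10, 0.60],  # erratic ratio, as a noise-dominated band might show
]
stable = select_stable_bands(bands)
```

Only the bands listed in `stable` would then be passed on to reconstruction in this reading of the pipeline.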
Hybrid architecture combines PANNs fine-grained acoustic features (2048-dim) with LESR structure-propagation biometric features. Contrastive learning with triplet loss + binary cross-entropy disentangles identity from behavioral variations.
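To show how a triplet loss and a binary cross-entropy term can be combined, here is a minimal Python sketch on plain lists. The Euclidean distance, the margin, the weighting between the two terms, and all function names are assumptions for illustration; the source states only that both losses are used.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embeddings."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Pull same-user embeddings together, push other users at least `margin` away."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def bce_loss(p, label, eps=1e-7):
    """Binary cross-entropy for a predicted genuine-user probability p."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


def combined_loss(anchor, positive, negative, p, label, weight=1.0):
    """Triplet term plus a weighted BCE term (the weight is hypothetical)."""
    return triplet_loss(anchor, positive, negative) + weight * bce_loss(p, label)


# Same-user taps are close and the other user is far, so only the BCE term remains.
loss = combined_loss([0.0, 0.0], [0.1, 0.0], [3.0, 4.0], p=0.9, label=1)
```

In this reading, the triplet term shapes the embedding space so that taps from one user cluster regardless of behavioral variation, while the BCE term trains the final accept or reject decision.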
The following analysis demonstrates I-PIN's robust performance across various temporal conditions, environmental scenarios, user postures, and force levels.
Performance consistency over time, showing stable authentication rates across different time periods.
Authentication performance across different environmental conditions including mall, office, subway, and taxi scenarios.
Performance analysis across different user interaction postures: tabletop, palm, and grip positions.
Authentication accuracy across different tap force levels: hard, medium, and light pressure.
Condition | ASR (%) | FPR (%) | F1-Score (%) | EER (%) |
---|---|---|---|---|
Mall | 100.0 | 0.0 | 100.0 | 0.0 |
Office | 100.0 | 1.0 | 99.5 | 0.5 |
Subway | 99.8 | 1.4 | 99.2 | 1.3 |
Taxi | 98.2 | 0.4 | 99.0 | 0.2 |
Hard Tap | 99.8 | 0.4 | 99.7 | 0.2 |
Light Tap | 99.4 | 0.8 | 99.3 | 0.7 |
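For readers who want to relate the columns to one another, the following Python sketch computes FPR, F1, and an approximate EER using their standard definitions. How I-PIN computed its reported values is not stated, so the function names and the threshold-scanning EER approximation are assumptions, and the example scores are made up.

```python
def fpr(fp, tn):
    """False positive rate: share of impostor attempts that were accepted."""
    return fp / (fp + tn)


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall, written in terms of counts."""
    return 2 * tp / (2 * tp + fp + fn)


def eer(genuine_scores, impostor_scores):
    """Approximate equal error rate.

    Scans every observed score as a threshold and returns the mean of the
    false rejection and false acceptance rates where they are closest.
    """
    best = None
    for t in sorted(set(genuine_scores) | set(impostor_scores)):
        frr = sum(s < t for s in genuine_scores) / len(genuine_scores)
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        gap = abs(frr - far)
        if best is None or gap < best[0]:
            best = (gap, (frr + far) / 2)
    return best[1]


# Illustrative scores only, not I-PIN data.
example_eer = eer([0.9, 0.8, 0.7, 0.4], [0.6, 0.3, 0.2, 0.1])
office_f1 = f1_score(tp=100, fp=1, fn=0)
```

If ASR is read as the true positive rate and genuine and impostor trials are assumed to be balanced (neither is stated in the table), the Office row's 100.0% ASR and 1.0% FPR give `office_f1` = 200/201, or about 99.5%, consistent with the reported F1-Score.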
Component | Metric | I-PIN Value | Best Baseline |
---|---|---|---|
LESR Denoiser | SNR Improvement | 59.6 dB | 33.3 dB |
Feature Quality | Correlation | 99.99% | 99.98% |
Multi-LocAF | ASR | 99.4% | 92.6% (single) |
I-PIN provides comprehensive defense against multiple attack vectors through its propagation-enhanced acoustic biometric approach:
Attack Vector | Traditional PIN | I-PIN Defense Mechanism | Effectiveness |
---|---|---|---|
Shoulder Surfing | ❌ Vulnerable | Visual-independent biometric authentication | 100% resistant |
Zero-effort Attack | ❌ Vulnerable | User-specific LocAF signatures | FPR: 0.06% |
Impersonation | ❌ Vulnerable | Anatomical trait encoding in LocAF | FPR: $4.6 \times 10^{-4}$ |
Replay Attacks | ❌ Vulnerable | Structure-borne vs. airborne detection | FPR: 0% |
Acoustic Eavesdropping | ❌ Vulnerable | On-device structure-borne signal isolation | 99.1% resistant |
Environmental Noise | ❌ Affected | LESR-wavelet denoising (59.6 dB SNR improvement) | Robust operation |