SEMANTICS OVER SYNTAX: A DEEP BI-LSTM FRAMEWORK FOR ROBUST PASSWORD STRENGTH ESTIMATION VIA HYBRID GROUND-TRUTH LABELING
Abstract
Alphanumeric passwords remain the most common measure for protecting digital identities, despite the rapid worldwide adoption of multi-factor authentication and biometric security. However, the effectiveness of this defense is often compromised by human cognitive constraints, which lead users to produce predictable patterns that standard rule-based strength metrics fail to detect. Existing estimators tend to rely on static heuristics such as character counts or basic entropy measures, which do not reflect the semantic predictability exploited by modern cracking tools. This study presents a robust, data-driven model for password strength classification based on a Bidirectional Long Short-Term Memory (Bi-LSTM) network. Unlike conventional unidirectional recurrent models, our Bi-LSTM processes character sequences in both the forward and backward directions, which allows it to capture the complex, non-linear dependencies within human-created passwords. To overcome the limitations of single-metric benchmarking, we propose a new hybrid labeling approach that combines probabilistic crack-time estimates (zxcvbn), Shannon entropy, and N-gram scoring to produce rigorous ground-truth labels for the RockYou dataset. Experiments on unseen test data yield a classification accuracy of 81.79 percent, which reflects realistic performance on noisy, real-world data rather than the artificially inflated values seen in overfitted models. Moreover, the model is highly sensitive to weak passwords (recall: 0.94), which minimizes the risk of weak credentials being misclassified as strong. The paper supports the use of deep learning in cryptanalysis and proposes a practical client-side application that can improve security in real time.
Keywords: Password Security, Bi-LSTM, Deep Learning, Cryptanalysis, Human-Computer Interaction, zxcvbn.
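
As a rough illustration of the hybrid labeling idea summarized in the abstract, the sketch below computes per-character Shannon entropy and combines it with two externally supplied signals by majority vote. The function names, the thresholds, the three-class label set, and the voting rule are all assumptions made for illustration; the abstract does not specify how the three signals are combined. The zxcvbn score is taken as an integer input (zxcvbn reports scores from 0 to 4), and `ngram_score` is a hypothetical predictability value in [0, 1], where higher means more predictable.

```python
import math
from collections import Counter


def shannon_entropy(password: str) -> float:
    """Per-character Shannon entropy (in bits) of the password's own
    character distribution: H = sum_i p_i * log2(1 / p_i)."""
    if not password:
        return 0.0
    n = len(password)
    counts = Counter(password)
    return sum((c / n) * math.log2(n / c) for c in counts.values())


def hybrid_label(zxcvbn_score: int, entropy_bits: float, ngram_score: float) -> str:
    """Hypothetical majority-vote combiner (not the authors' rule).

    Each signal casts one vote for "looks strong":
      - zxcvbn_score >= 3      (assumed threshold on the 0..4 scale)
      - entropy_bits >= 3.0    (assumed threshold, bits per character)
      - ngram_score <= 0.5     (assumed: low N-gram predictability)
    """
    votes = sum([
        zxcvbn_score >= 3,
        entropy_bits >= 3.0,
        ngram_score <= 0.5,
    ])
    # 0 or 1 votes -> weak, 2 votes -> medium, 3 votes -> strong (assumed mapping)
    return ["weak", "weak", "medium", "strong"][votes]


if __name__ == "__main__":
    pw = "password"
    h = shannon_entropy(pw)
    # The zxcvbn score and ngram_score here are made-up inputs for the demo.
    print(pw, round(h, 3), hybrid_label(zxcvbn_score=0, entropy_bits=h, ngram_score=0.9))
```

A vote-based rule is only one option; a weighted sum or a strict "weakest signal wins" rule would bias the labels toward different error types, so the choice should follow whichever misclassification is more costly.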













