Parsing and filtering the YouPorn password leak
This article is about parsing the leaked files related to the YouPorn incident of February 2012. You can read about the incident here.
Parsing and importing into MySQL
The interesting information seems to be formatted in only one of two ways. This is how you could extract that information and insert it into a MySQL table:
- parse.php
<?php
// Connect to the local MySQL server; adjust the credentials as needed.
$db = new PDO('mysql:dbname=cracking;host=127.0.0.1', 'root', '');
$db->query('SET NAMES utf8mb4');
// Wrap all inserts in one transaction and commit once at the end.
$db->beginTransaction();

$fileCount = 1;
foreach (glob('*.log') as $filename) {
    $i = 0;
    $string = file_get_contents($filename);

    // Format 1: username, email, password, password_confirm (no email_confirm).
    if (preg_match_all('/username=(.*)\semail=(.*)\spassword=(.*)\spassword_confirm=(.*)\s/', $string, $matches)) {
        $statement = $db->prepare("INSERT INTO `cracking`.`2012-02-22-youporn` VALUES (?, ?, NULL, ?, ?)");
        foreach ($matches[1] as $k => $v) {
            $statement->execute(array($matches[1][$k], $matches[2][$k], $matches[3][$k], $matches[4][$k]));
            $i++;
        }
    }

    // Format 2: username, email, email_confirm, password, password_confirm.
    if (preg_match_all('/username=(.*)\semail=(.*)\semail_confirm=(.*)\spassword=(.*)\spassword_confirm=(.*)\s/', $string, $matches)) {
        $statement = $db->prepare("INSERT INTO `cracking`.`2012-02-22-youporn` VALUES (?, ?, ?, ?, ?)");
        foreach ($matches[1] as $k => $v) {
            $statement->execute(array($matches[1][$k], $matches[2][$k], $matches[3][$k], $matches[4][$k], $matches[5][$k]));
            $i++;
        }
    }

    echo 'File #', $fileCount, " (", $filename, "). ", "Executed ", $i, " statements. \n";
    $fileCount++;
}

$db->commit();
For this to work you obviously need a table like this:
- table.sql
CREATE TABLE `cracking`.`2012-02-22-youporn` (
    `username` blob,
    `email` blob,
    `email_confirm` blob,
    `password` blob,
    `password_confirm` blob,
    KEY `username` (`username`(128)),
    KEY `email` (`email`(128)),
    KEY `email_confirm` (`email_confirm`(128)),
    KEY `password` (`password`(128)),
    KEY `password_confirm` (`password_confirm`(128))
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
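Before importing into a real database, it can help to try the two patterns from parse.php on a small made-up sample. The sketch below is only an illustration: the sample text, the one-field-per-line layout and the helper name parseSignupAttempts are assumptions, not taken from the leaked files.

```php
<?php
// Hypothetical sample. It assumes one "key=value" field per line, which is
// what the patterns in parse.php suggest; the real log layout may differ.
$sample = "username=alice\n"
        . "email=alice@example.com\n"
        . "password=hunter2\n"
        . "password_confirm=hunter2\n"
        . "username=bob\n"
        . "email=bob@example.com\n"
        . "email_confirm=bob@example.com\n"
        . "password=letmein\n"
        . "password_confirm=letmein\n";

/**
 * Extract signup attempts as rows of
 * [username, email, email_confirm, password, password_confirm].
 * (Hypothetical helper; it mirrors the two patterns used in parse.php.)
 */
function parseSignupAttempts(string $log): array
{
    $rows = [];

    // Four-field format: email_confirm is missing, so it is stored as null.
    $short = '/username=(.*)\semail=(.*)\spassword=(.*)\spassword_confirm=(.*)\s/';
    if (preg_match_all($short, $log, $m, PREG_SET_ORDER)) {
        foreach ($m as $hit) {
            $rows[] = [$hit[1], $hit[2], null, $hit[3], $hit[4]];
        }
    }

    // Five-field format, including email_confirm.
    $long = '/username=(.*)\semail=(.*)\semail_confirm=(.*)\spassword=(.*)\spassword_confirm=(.*)\s/';
    if (preg_match_all($long, $log, $m, PREG_SET_ORDER)) {
        foreach ($m as $hit) {
            $rows[] = [$hit[1], $hit[2], $hit[3], $hit[4], $hit[5]];
        }
    }

    return $rows;
}

$rows = parseSignupAttempts($sample);
```

Each row has the same column order as the table, so it could be passed to a prepared five-placeholder INSERT. If the real files put several fields on one line, the greedy (.*) groups would behave differently and the patterns would need to be checked again.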
Filtering information
We want to exclude form submissions that resulted in an error and did not actually create a new user. Let's fetch only the account creation attempts where:
- password equals password_confirm
- the password is not empty
- the email address is not empty
- each combination of email and password appears only once
- filter.sql
SELECT
    CONVERT(username USING utf8mb4),
    CONVERT(email USING utf8mb4),
    CONVERT(email_confirm USING utf8mb4),
    CONVERT(password USING utf8mb4),
    CONVERT(password_confirm USING utf8mb4)
FROM `cracking`.`2012-02-22-youporn`
WHERE password = password_confirm
    AND password != ''
    AND email != ''
GROUP BY email, password
Exporting passwords and running statistical analysis
You could also export the passwords to a file and run statistical analysis tools, such as Passpal 0.4, on it:
Passpal 0.4 dump
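One way to produce such a file is to write one password per line. The sketch below assumes that a line-based wordlist is what the analysis tool expects; the helper name writeWordlist and the in-memory test values are made up for illustration.

```php
<?php
/**
 * Write one password per line to $path and return the number written.
 * Empty values and values containing line breaks are skipped, because
 * they cannot be represented in a line-based wordlist.
 * (Hypothetical helper, not part of the original scripts.)
 */
function writeWordlist(iterable $passwords, string $path): int
{
    $handle = fopen($path, 'wb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path for writing");
    }

    $written = 0;
    foreach ($passwords as $password) {
        $password = (string) $password;
        if ($password === '' || strpbrk($password, "\r\n") !== false) {
            continue;
        }
        fwrite($handle, $password . "\n");
        $written++;
    }

    fclose($handle);
    return $written;
}

// With a database, the input could be the password column of the filter
// query, e.g. $db->query($sql)->fetchAll(PDO::FETCH_COLUMN, 3).
// Here a small in-memory list stands in for it.
$path = tempnam(sys_get_temp_dir(), 'wordlist');
$count = writeWordlist(['123456', '', "bad\nvalue", 'qwerty'], $path);
```

Duplicates are deliberately kept, since frequency statistics depend on how often each password occurs.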
Running Passpal you might get a report like this (here's a link to the report by itself):
passpal 0.4 report (www.thepasswordproject.com)

Total words: 1566156
Unique words: 833994 (53.25 %)

Word frequency, sorted by count, top 10
+------------------------------+
| Word      | Count | Of total |
+------------------------------+
| 123456    | 37981 | 2.4251 % |
| 123456789 | 10415 | 0.665 %  |
| 12345     | 6921  | 0.4419 % |
| 1234      | 4725  | 0.3017 % |
| password  | 4361  | 0.2785 % |
| 12345678  | 3235  | 0.2066 % |
| qwerty    | 3203  | 0.2045 % |
| 1234567   | 2982  | 0.1904 % |
| 111111    | 2354  | 0.1503 % |
| 123       | 2277  | 0.1454 % |
+------------------------------+

Base word (len>=3) frequency, sorted by count, top 10
+-----------------------------+
| Word     | Count | Of total |
+-----------------------------+
| password | 5826  | 0.372 %  |
| qwerty   | 4230  | 0.2701 % |
| youporn  | 2601  | 0.1661 % |
| sex      | 1928  | 0.1231 % |
| abc      | 1873  | 0.1196 % |
| pussy    | 1838  | 0.1174 % |
| fuckyou  | 1499  | 0.0957 % |
| ficken   | 1339  | 0.0855 % |
| sexy     | 1305  | 0.0833 % |
| love     | 1264  | 0.0807 % |
+-----------------------------+

Length frequency, sorted by length, full table
+-----------------------------+
| Length | Count  | Of total  |
+-----------------------------+
| 0      | 1      | 0.0001 %  |
| 1      | 865    | 0.0552 %  |
| 2      | 3164   | 0.202 %   |
| 3      | 15949  | 1.0184 %  |
| 4      | 65196  | 4.1628 %  |
| 5      | 81589  | 5.2095 %  |
| 6      | 413271 | 26.3876 % |
| 7      | 255748 | 16.3297 % |
| 8      | 325944 | 20.8117 % |
| 9      | 173534 | 11.0802 % |
| 10     | 118276 | 7.552 %   |
| 11     | 50045  | 3.1954 %  |
| 12     | 29770  | 1.9008 %  |
| 13     | 13806  | 0.8815 %  |
| 14     | 8314   | 0.5309 %  |
| 15     | 4539   | 0.2898 %  |
| 16     | 2920   | 0.1864 %  |
| 17     | 1074   | 0.0686 %  |
| 18     | 747    | 0.0477 %  |
| 19     | 321    | 0.0205 %  |
| 20     | 358    | 0.0229 %  |
| 21     | 156    | 0.01 %    |
| 22     | 105    | 0.0067 %  |
| 23     | 68     | 0.0043 %  |
| 24     | 47     | 0.003 %   |
| 25     | 37     | 0.0024 %  |
| 26     | 32     | 0.002 %   |
| 27     | 23     | 0.0015 %  |
| 28     | 10     | 0.0006 %  |
| 29     | 12     | 0.0008 %  |
| 30     | 8      | 0.0005 %  |
| 31     | 11     | 0.0007 %  |
| 32     | 216    | 0.0138 %  |
+-----------------------------+

Charset frequency, sorted by count, full table
+-------------------------------------------------------------------------+
| Charset                      | Count   | Of total  | Count/keyspace     |
+-------------------------------------------------------------------------+
| lower-upper-numeric-symbolic | 1560739 | 99.6541 % | 16428.831578947367 |
| lower-upper-numeric          | 1532315 | 97.8392 % | 24714.75806451613  |
| lower-numeric-symbolic       | 1476935 | 94.3032 % | 21404.855072463768 |
| lower-numeric                | 1454060 | 92.8426 % | 40390.555555555555 |
| lower-upper-symbolic         | 763053  | 48.7214 % | 8977.094117647059  |
| lower-upper                  | 750871  | 47.9436 % | 14439.826923076924 |
| lower-symbolic               | 722999  | 46.1639 % | 12254.22033898305  |
| lower                        | 712706  | 45.5067 % | 27411.76923076923  |
| upper-numeric-symbolic       | 339397  | 21.6707 % | 4918.797101449275  |
| upper-numeric                | 336290  | 21.4723 % | 9341.388888888889  |
| numeric-symbolic             | 309025  | 19.7314 % | 7186.627906976744  |
| numeric                      | 307021  | 19.6035 % | 30702.1            |
| upper-symbolic               | 17832   | 1.1386 %  | 302.23728813559325 |
| upper                        | 16904   | 1.0793 %  | 650.1538461538462  |
| symbolic                     | 434     | 0.0277 %  | 13.151515151515152 |
+-------------------------------------------------------------------------+

Charset frequency, sorted by count/keyspace, full table
+-------------------------------------------------------------------------+
| Charset                      | Count   | Of total  | Count/keyspace     |
+-------------------------------------------------------------------------+
| lower-numeric                | 1454060 | 92.8426 % | 40390.555555555555 |
| numeric                      | 307021  | 19.6035 % | 30702.1            |
| lower                        | 712706  | 45.5067 % | 27411.76923076923  |
| lower-upper-numeric          | 1532315 | 97.8392 % | 24714.75806451613  |
| lower-numeric-symbolic       | 1476935 | 94.3032 % | 21404.855072463768 |
| lower-upper-numeric-symbolic | 1560739 | 99.6541 % | 16428.831578947367 |
| lower-upper                  | 750871  | 47.9436 % | 14439.826923076924 |
| lower-symbolic               | 722999  | 46.1639 % | 12254.22033898305  |
| upper-numeric                | 336290  | 21.4723 % | 9341.388888888889  |
| lower-upper-symbolic         | 763053  | 48.7214 % | 8977.094117647059  |
| numeric-symbolic             | 309025  | 19.7314 % | 7186.627906976744  |
| upper-numeric-symbolic       | 339397  | 21.6707 % | 4918.797101449275  |
| upper                        | 16904   | 1.0793 %  | 650.1538461538462  |
| upper-symbolic               | 17832   | 1.1386 %  | 302.23728813559325 |
| symbolic                     | 434     | 0.0277 %  | 13.151515151515152 |
+-------------------------------------------------------------------------+

Hashcat mask frequency, sorted by count, top 10
+------------------------------------------------------------------------+
| Mask                     | Count  | Of total  | Count/keyspace         |
+------------------------------------------------------------------------+
| ?l?l?l?l?l?l             | 194284 | 12.4051 % | 0.0006289222341302504  |
| ?d?d?d?d?d?d             | 136252 | 8.6998 %  | 0.136252               |
| ?l?l?l?l?l?l?l           | 131012 | 8.3652 %  | 1.6311640480682596e-05 |
| ?l?l?l?l?l?l?l?l?l?l?l?l | 129221 | 8.2508 %  | 1.354106809090646e-12  |
| ?l?l?l?l?l?l?l?l?l       | 67597  | 4.3161 %  | 1.2449940914810972e-08 |
| ?l?l?l?l?l               | 54885  | 3.5044 %  | 0.004619414451659471   |
| ?d?d?d?d?d?d?d?d?l?l?l?l | 51634  | 3.2969 %  | 1.129906165750499e-09  |
| ?l?l?l?l?l?l?l?l?l?l     | 42464  | 2.7114 %  | 3.0080664196893873e-10 |
| ?l?l?l?l?l?l?d?d?l?l?l?l | 39088  | 2.4958 %  | 2.7689172054638933e-12 |
| ?l?l?l?l?l?l?l?l         | 35665  | 2.2772 %  | 1.7078724959532325e-07 |
+------------------------------------------------------------------------+
Words that didn't match any ?l?u?d?s mask: 5417 (0.3459 %)

Hashcat mask frequency, sorted by count/keyspace, top 10
+-------------------------------------------------+
| Mask   | Count | Of total | Count/keyspace      |
+-------------------------------------------------+
| ?d     | 279   | 0.0178 % | 27.9                |
| ?l     | 534   | 0.0341 % | 20.53846153846154   |
| ?d?d   | 655   | 0.0418 % | 6.55                |
| ?d?d?d | 3849  | 0.2458 % | 3.849               |
| ?l?l   | 2215  | 0.1414 % | 3.276627218934911   |
| ?s     | 29    | 0.0019 % | 0.8787878787878788  |
| ?u     | 19    | 0.0012 % | 0.7307692307692307  |
| ?l?l?l | 10982 | 0.7012 % | 0.6248293126991352  |
| ?l?d   | 62    | 0.004 %  | 0.23846153846153847 |
| ?u?u   | 132   | 0.0084 % | 0.1952662721893491  |
+-------------------------------------------------+
Words that didn't match any ?l?u?d?s mask: 5417 (0.3459 %)

Charset distribution of characters in beginning and end of words (len>=6)
+-------------------------------------------------------------------------------------------------+
| Charset\Index | 0 (first char) | 1         | 2         | -3        | -2        | -1 (last char) |
+-------------------------------------------------------------------------------------------------+
| lower         | 72.4842 %      | 75.0555 % | 74.7783 % | 65.2347 % | 55.6883 % | 50.3243 %      |
| upper         | 4.976 %        | 1.9758 %  | 1.9396 %  | 1.6236 %  | 1.4058 %  | 1.3221 %       |
| digits        | 22.3639 %      | 22.7691 % | 23.0835 % | 32.7727 % | 42.5487 % | 47.6496 %      |
| symbols       | 0.1759 %       | 0.1996 %  | 0.1986 %  | 0.369 %   | 0.3572 %  | 0.7039 %       |
+-------------------------------------------------------------------------------------------------+

Total characters: 11727903
Unique characters: 356
Top 50 characters: ae1oirns2l0t3m9c45d68uh7bkpgyfwjvxzqAESORINLMTCDBP

Character frequency, sorted by count, top 10
+-------------------------------+
| Character | Count  | Of total |
+-------------------------------+
| a         | 880245 | 7.5056 % |
| e         | 754693 | 6.435 %  |
| 1         | 648644 | 5.5308 % |
| o         | 583786 | 4.9778 % |
| i         | 534381 | 4.5565 % |
| r         | 514377 | 4.3859 % |
| n         | 508634 | 4.337 %  |
| s         | 490661 | 4.1837 % |
| 2         | 461183 | 3.9324 % |
| l         | 435806 | 3.716 %  |
+-------------------------------+

Symbol frequency, sorted by count, top 10
+----------------+
| Symbol | Count |
+----------------+
| .      | 8299  |
| !      | 5150  |
| _      | 4819  |
| -      | 4616  |
|        | 4332  |
| @      | 4007  |
| *      | 2326  |
| #      | 1623  |
| /      | 1291  |
| ,      | 1269  |
+----------------+
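The "Count/keyspace" column of the mask tables appears to be the count divided by the number of candidates the mask covers. The sketch below reproduces that arithmetic under stated assumptions: the token sizes (26 lowercase, 26 uppercase, 10 digits, 33 symbols including the space) are inferred from the report rows, for example 29 / 33 for ?s and 62 / 260 for ?l?d, and the function names are made up. It is not Passpal's actual implementation.

```php
<?php
// Keyspace size per mask token, inferred from the report above (assumption):
// 26 lowercase, 26 uppercase, 10 digits, 33 printable ASCII symbols.
const KEYSPACE = ['?l' => 26, '?u' => 26, '?d' => 10, '?s' => 33];

/** Map a word to its Hashcat mask, or null if a character fits no class. */
function hashcatMask(string $word): ?string
{
    if ($word === '') {
        return null;
    }
    $mask = '';
    foreach (str_split($word) as $char) {
        if (ctype_lower($char)) {
            $mask .= '?l';
        } elseif (ctype_upper($char)) {
            $mask .= '?u';
        } elseif (ctype_digit($char)) {
            $mask .= '?d';
        } elseif ($char === ' ' || ctype_punct($char)) {
            $mask .= '?s';
        } else {
            return null; // e.g. a byte of a multi-byte UTF-8 character
        }
    }
    return $mask;
}

/** Number of candidates a mask covers: the product of its token sizes. */
function maskKeyspace(string $mask): float
{
    $size = 1.0;
    foreach (str_split($mask, 2) as $token) {
        $size *= KEYSPACE[$token];
    }
    return $size;
}

// Count/keyspace for the ?d?d?d?d?d?d row of the report: 136252 / 10^6.
$ratio = 136252 / maskKeyspace(hashcatMask('123456'));
```

A high count/keyspace value means a mask finds many passwords per candidate tried, which is why the very short masks lead the second mask table even though their absolute counts are small.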