The Misadventures of Quinxy: truths, lies, and everything in between!

4 Nov 2011

Windows Tip #931: Slow File Copy in Windows 7? Close the Performance Monitor!!!

Today I had a very bizarre problem. I was trying to copy a large (50 GB) file from a laptop's hard drive to an external USB drive. I'd copied an even larger file (150 GB) just a few hours earlier without incident. But this 50 GB file would begin at full speed (about 30 MB/s), then start to get progressively slower as it approached about 8 GB, and it would essentially be stalled by the time it reached 9 or 10 GB transferred, slowing to the point where it was clear the transfer would never get any further. I tried every recommendation I could find relating to slow copying and external USB drives: I updated the external USB drive's firmware, set the drive to be optimized for performance, tried rebooting, and used several different copy methods, always with the same result.

Because of some initial slowness with this 50 GB transfer I'd begun using the Windows Performance Monitor to watch just what might be slowing down the file copy. This let me resolve the initial problem: Raxco's PerfectDisk was trying to defrag the drive as I was doing the copy. But with PerfectDisk off the problem remained, or at least persisted in a slightly different form. One odd thing I noticed in Performance Monitor was that the wait time for the drives in question would sit at 0, suddenly jump to 5 seconds, and drop back, all while the disks appeared to be doing almost nothing. After a while I used Sysinternals' great Process Monitor to flag anything touching the paths of the source or target disks, and there it was: perfmon.exe was the only other thing accessing the target drive. I shut perfmon.exe down and the speed went from the languid 40 KB/s it had become back to the normal 31 MB/s. Apparently perfmon.exe has a little problem!

So the lesson of the day is: Don't leave perfmon.exe running (for long) when doing big file copies!
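
If you script your big copies, a quick check like the following (a minimal sketch in Python, assuming only the standard Windows tasklist and taskkill tools) could save you the same head-scratching:

    # Warn about (or optionally close) Performance Monitor before a large copy.
    # Sketch only; assumes Windows and Python 3.7+.
    import subprocess

    out = subprocess.run(
        ["tasklist", "/FI", "IMAGENAME eq perfmon.exe"],
        capture_output=True, text=True,
    ).stdout

    if "perfmon.exe" in out.lower():
        print("Performance Monitor is running; close it before the big copy.")
        # subprocess.run(["taskkill", "/IM", "perfmon.exe"])  # uncomment to close it for you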

^ Quinxy

12 Mar 2011

Running Google Android 2.2 on Your PC with VMware in Less than 5 Minutes

If you're serious about playing around with Android I urge you to check out my article on how you can convert a $249 Barnes & Noble Nook Color e-reader into a full Android tablet!  I just did it and it's turning out to be one of the coolest gadgets I've had!

Tonight I wanted to play around with the Google Android OS for mobile devices, but having neither an Android tablet nor a phone I was forced to investigate how I could run it on my computer.  I found the answer I was looking for and succeeded in running it on my PC.  And here is my super quick guide on how you can do it, too.

Step 0.

You will need the virtual machine software VMware Player or VMware Workstation.  If you don't have either, you can download and install VMware Player for free.

Step 1.

Grab the Android Live ISO; the one to use is the Asus Eee PC version. (I tried the generic version and it wouldn't even boot under VMware.) You can navigate to the latest version here or just use this direct link for the 2.2 version.

Step 2.

Configure the VMware Player or VMware Workstation options for this VM (a sketch of the corresponding .vmx entries follows this list). You want to choose:

  • CD/DVD pointed at the ISO file you just downloaded for Android
  • 512 MB memory
  • Any network setting should work (BUT, you will need to follow the instructions in step 3)
  • Sound card should be changed to "SB X-Fi Audio"
  • 2 GB IDE hard disk (optional)
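
For reference, those choices end up as plain key/value lines in the VM's .vmx file, something like the following (typical key names; the ISO path is a placeholder, your device numbering may differ, and the sound and disk options are easiest to set in the VMware UI):

    memsize = "512"
    ide1:0.present = "TRUE"
    ide1:0.deviceType = "cdrom-image"
    ide1:0.fileName = "C:\isos\android-x86-2.2.iso"
    ethernet0.virtualDev = "vlance"

The last line is the network change covered in step 3.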

Step 3.

With the VM powered off, use a text editor to modify the .vmx file that VMware created.  You MUST change the existing line so that it reads:

ethernet0.virtualDev = "vlance"

If you don't make this change you will have no network access in Android!

Step 4.

Power on the Android VM, choose the first option from the bootloader screen, and everything should work!

 

Making it Permanent

The above works great for getting a feel for Android, but because this is a "live" version of Android using a RAM disk for temporary storage, all your changes will be lost when you shut down or reboot.  Making your environment permanent is actually very easy:

  • Reboot the virtual machine (Power > Reset in VMware)
  • Choose the "Install to hard disk" option from the bootloader
  • Create a single primary partition in the partition editor, using all available space.  Make the partition bootable.  Quit the partition editor.
  • Allow it to install the OS to the selected partition, using ext3.
  • Allow the installer to use Grub as your boot loader.
  • Do not attempt to create a virtual SD card.  (I didn't investigate how this works, but when I tried it, it appeared to overwrite the OS I had just written to disk, so don't do this unless you know what you're doing.)
  • Choose to Run Android x86 when asked.

And now you've got a permanent Android x86 virtual machine!

Notes

Certain features are not supported by Android x86, primarily those applications which require devices missing from the virtual machine (e.g., the camera).  Other applications, such as the YouTube app, appear to work except that videos do not seem to play; I suspect this may have to do with specific hardware acceleration missing from the virtualization.  Also, see the many debugging and virtualization-related options in the app list; you can do things like spoof geolocation.  While limited in some respects, this is an excellent tool for testing and debugging your web and mobile apps on Android.

Have fun playing around with it!

^ Quinxy

26 Feb 2011

A New Approach to CAPTCHA: Ask My Computer a Question, Not Me

Being able to tell a human from a computer is pretty important. While we don't yet need to worry about artificial intelligence run amok, experience has taught us that we do need to worry about the trouble humans can create when they can automate their mischief. These little tests computers give us, called CAPTCHAs (short for Completely Automated Public Turing test to tell Computers and Humans Apart), are used to prevent automated scripts from doing such things as creating millions of bogus Gmail accounts from which to spam us good e-citizens all to hell.  For years now the technique employed for CAPTCHAs has been a set of distorted characters we humans are required to use our powerful brains to interpret and then type in.  We humans are still better at pattern recognition than mere computers, and so even the dullest of us pass the test, most of the time.  For years CAPTCHAs asked these pointless questions and compared our answers with known pointless answers.  Then a new variant arrived, reCAPTCHA, to at least marry these potentially good labors of man to potentially good works, allowing us to band together in transcribing difficult-to-read words found in old books.  A hundred years or so of the New York Times has been transcribed in just such a fashion.

And while that's all well and good and has been making the best of a bad situation, the truth remains that CAPTCHAs as thus far implemented are still awfully rotten things, wasting our precious little time on this big blue marble proving we are human, which is something we just shouldn't need to prove to anyone or anything.  A truly great CAPTCHA shouldn't need to ask us any question; it should just size us up as human beings and declare with authority, "By god, this is a human being sitting in front of me."  Or if it couldn't do that, couldn't detect something approximating our soul, it should at least employ some other solution to the same basic problem of gating access to a resource without needing anything from our person.

I imagined just such a solution in 1997, and surely some towering genius imagined it even earlier than that.  I'm surprised no one has developed it in the years since.  My schedule is so painfully busy I haven't bothered; that, and my woeful disorganization and disinterest in the minutiae required to achieve its perfect implementation.  Still, I have the concept, and I am fully convinced of its probable efficacy.  🙂  And if I can't be bothered to create it, at least I can bitch and moan that no one else has either.  And hope for the day when someone will, because I think the ultimate solution must resemble the one I am proposing here.

First we must admit one thing: no CAPTCHA is perfect, and none make serious attempts to be.  CAPTCHAs are a balance between maximizing the thwarting of automated scripts and minimizing the annoyance of the humans who need to type in the often hard-to-interpret letters.  If you make the images too hard to read, people will become so annoyed that they'll begin to avoid your website; if you make them too easy, the fake users on your site will exceed the real ones.  So a CAPTCHA solution need not be perfect, it simply must prevent severe abuses.  The key point in an example involving a site like Gmail is that if you limit the number of accounts created by an automated script, then the limits separately applied to each account will prevent the serious problems.  For example, an automated script today may be able to guess enough reCAPTCHAs to create a few hundred fake Gmail accounts a day, but if each Gmail account is limited to sending only a hundred emails a day, then this spammer won't be able to send enough spam to earn himself a steak dinner.

With that explanation of the real expectations of CAPTCHAs I can reveal my super top secret solution...

Instead of requiring a human to view, decipher, and type in a code, require that the user's computer do some fiendishly complicated math, the solution to which it then passes silently to the website in question at the moment your humanness would otherwise have been proved.  This CAPTCHA is obviously not living up to its name in proving you are a human; what it is doing is requiring you to prove that you have sufficient computing resources to expend on acquiring some resource you want.  To add further protection, the model would be extended such that executing particular web transactions would also incur this penalizing CPU burden.  These computational tokens wouldn't merely be required for creating accounts; they would also be required to download files, send emails, post comments, etc.

The actual math done behind the scenes by the formula I propose is beyond the scope of this document, but it would be of the sort where the server delivers JavaScript (or a Java applet with JavaScript glue) with the formula and some input parameters embedded; at the end of some intense computation the requested solution is found and returned to the server, which could instantly verify it (testing the solution for validity must not require recomputing the problem; it would instead merely involve applying the solution to see if it solves it).  The code in question would execute silently in the background as you accessed the website, ideally beginning its execution long before it was needed.  For example, the first time you land on Gmail.com the script could begin computing the token.  As you changed pages it would halt and then restart the computation, making gradual progress.  By the time you completed the sign-up form it would have the solution token computed and ready to use, without your needing to wait at all.  In the worst case, where you might go directly to the sign-up page and fill it out with super speed, you might need to wait 10-15 seconds for the token to be generated and exchanged.  Continuing this example, once you've signed up for your account you would use Gmail as always, but now, without your being aware of it, every time you send an email a similar token generation and exchange would take place.

By demanding these significant chunks of computing resources from each user for each important transaction you are ensuring that nefarious persons with limited resources would be unable to inflict significant damage.  As with any CAPTCHA system, automation could bypass the system to a degree, but only a very limited degree, with very limited consequences.  With this sort of approach we could all be liberated from the traditional image-based CAPTCHA, never again wasting our precious time in the service of proving our very humanness.  We could finally defeat the bogus philosophy of Rene Descartes 2.0 and his, "I am, therefore I will type these characters."  We could be free to waste our lives on reality TV, internet pr0n, and furry conventions.  But seriously, we should do this.
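
To make the idea concrete, here is a minimal hashcash-style sketch of the scheme in Python (the function names, the choice of SHA-256, and the difficulty number are illustrative assumptions, not the actual formula I have in mind): the server hands out a random challenge, the client burns CPU hunting for a matching nonce, and the server verifies the result with a single hash.

    import hashlib
    import os

    DIFFICULTY_BITS = 20  # tune per resource; higher means more client CPU

    def issue_challenge() -> bytes:
        # Server side: a random challenge tied to the pending transaction.
        return os.urandom(16)

    def solve(challenge: bytes) -> int:
        # Client side (the expensive part): find a nonce whose hash has
        # DIFFICULTY_BITS leading zero bits.
        target = 1 << (256 - DIFFICULTY_BITS)
        nonce = 0
        while True:
            digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
            if int.from_bytes(digest, "big") < target:
                return nonce
            nonce += 1

    def verify(challenge: bytes, nonce: int) -> bool:
        # Server side (the cheap part): one hash decides whether the token is valid.
        digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
        return int.from_bytes(digest, "big") < (1 << (256 - DIFFICULTY_BITS))

    challenge = issue_challenge()
    token = solve(challenge)         # slow on the client
    assert verify(challenge, token)  # instant on the server

The essential property is the asymmetry: finding the token costs the client on the order of a million hashes, while checking it costs the server exactly one.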

And now some additional notes and caveats I didn't feel like crafting elegantly into seamlessly arranged paragraphs above:

  • Regarding the Formula...  I am no expert in the math required here, so I can only describe the features we'd need from such a formula.  I strongly suspect these sorts of formulas could be created; the required features remind me of those used in cryptography and prime number computation (difficult computation, easy testing of the solution, difficult or impossible to shortcut).
  • How hard should the math problem be?  I haven't computed how many CPU cycles should be required, and presumably it would vary depending on the resource you were trying to protect.  Important in the implementation would be that the computer and the browser feel responsive despite the intense labor being required of them.  This responsiveness might be the bugbear in any implementation, particularly one that uses JavaScript exclusively.  Implementation would probably require the computation portion be done in Java, with communication between it and the web page taking place via JavaScript.  And perhaps even Java would deliver an unpleasant experience, which is why I would argue that the real place for this solution is in the browser itself, via some standard.  Computers can quite pleasantly do intense work without negatively affecting user experience if the process doing that work is executing at a lower CPU priority, but such things cannot be done in JavaScript (except by poor simulation, adding pauses in script execution) and perhaps not with a Java applet (I don't recall; I suspect it can be done, but I just don't remember).
  • And now for the possible fly in the ointment...  It cannot be denied that there are botnets which could be employed against just this sort of defensive strategy.  A botnet operator could potentially direct the activities of tens of thousands of computers toward the purpose of bypassing these protections, stealing chunks of CPU cycles from each computer for the proxy computation of a separate automation script.  I have a few ideas about how to mitigate this risk, but I hesitate to share those nascent ideas prematurely.  The risk of this sort of botnet attack isn't known, nor is how this threat would compare to the threat of a similar botnet using a similar attack against traditional CAPTCHA images using traditional OCR techniques.  The goal is to have this alternate approach be no worse than what currently exists and in fact potentially quite a bit better, and I think that's achievable.

^ Quinxy

24 Jan 2011

.Net (dotnet) Install Base Statistics – Picking Your Framework Version

When building C# applications in Microsoft's Visual Studio 2010 you must make a choice: which .Net framework do you target?  As developers we'd all love to target the latest framework, which has all the latest features, but you don't want to lose a huge percentage of potential users who balk at the torturous Microsoft .Net installation.  If a user doesn't have the needed version of the .Net framework they'll be forced to download an additional 25+ MB file, forced to wait as that framework is installed, then forced to close all their applications and reboot their computer.  With past applications we've deployed we've seen as many as 30% of people abandon our software's installation in order to avoid the .Net installation.  Picking your framework is, therefore, important.  Target the wrong one and you'll lose more users than you need to.

It has been difficult to determine just what version to target, as Microsoft hardly wants to publish the anemic numbers themselves, and no one else seems much interested in sharing their own.  We included a .Net version checker in some of the software we distribute to find the truth ourselves... and here it is.  (A sketch of how such a check can read the registry follows the table.)  These numbers represent the users coming to DriverGuide.com, so other sites with different audiences would likely see different numbers, but I suspect they'd not be too far off.

.Net Version             Percentage of Computers Scanned
pre 1.1 (or none)        21.6%
1.1 no service pack      27.2%
1.1 SP 1                 24.6%
2.0 no service pack      74.2%
2.0 SP 1                  1.7%
2.0 SP 2                 62.7%
3.0 no service pack      62.4%
3.0 SP 1                  1.0%
3.0 SP 2                 61.5%
3.5 no service pack      61.6%
3.5 SP 1                 61.3%
4.0 client install       28.3%
4.0 full install          5.6%
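
A minimal sketch of how such a version check can read the Windows registry (the standard "NET Framework Setup\NDP" keys; this is illustrative Python, not the checker we actually shipped):

    import winreg

    def installed_net_versions():
        # Each installed framework shows up as a subkey under NDP,
        # e.g. "v2.0.50727", "v3.5", "v4" (which has Client/Full subkeys).
        versions = []
        ndp = r"SOFTWARE\Microsoft\NET Framework Setup\NDP"
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, ndp) as root:
            for i in range(winreg.QueryInfoKey(root)[0]):
                name = winreg.EnumKey(root, i)
                if name.startswith("v"):
                    versions.append(name)
        return versions

    print(installed_net_versions())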

Unless you can be sure your users will tolerate the painful .Net framework installations, this data clearly shows you only have two real choices: either target 2.0 and seriously inconvenience 25% of people, or choose 3.5 and seriously inconvenience 38%.  Neither is a good option, but they are what Microsoft leaves you with.

One alternative we've flirted with in the past to bypass the horrific .Net installation requirement was using Xenocode's Postbuild software to create a distributable that was free of the .Net installation, because the exe built with Postbuild included the necessary .Net components inside a virtual machine wrapper.  It worked beautifully, but it turned our tiny 150 KB executable into a self-compressed 25 MB executable.  Still, in this day and age of broadband Internet one could hardly argue it wasn't a far better user experience.  Xenocode Postbuild has since morphed into a more encompassing idea with a slightly different build utility called Spoon Studio, which we are currently testing; we've found similar results so far.

Virtualization was done with Spoon Studio, bundling the .Net 3.5 Client Profile:

Compression Method                         Size
Raw, Spoon Not Used                        150 KB
Streamable / No Compression                50 MB
Not-Streamable / Compressed by Spoon       18 MB
Not-Streamable / Compressed as RAR SFX     15 MB
Not-Streamable / Compressed as 7-Zip SFX   13.5 MB

We will likely do what we've done before and deploy a tiny installer which bundles in only the raw .Net-dependent executable; then, for the roughly 40% of people we see who don't have the right .Net version, we'll automatically fetch the larger monolithic exe they need over the net.  It's as near to an ideal situation as we can have when Microsoft insists on making .Net installation so horrible.
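
For what it's worth, the bootstrap decision itself is tiny; here is a hypothetical sketch (the payload names are made up, and the real stub would be a native executable rather than Python):

    import winreg

    def has_net35() -> bool:
        # The standard registry marker for .Net 3.5: Install == 1 under its NDP key.
        try:
            key = r"SOFTWARE\Microsoft\NET Framework Setup\NDP\v3.5"
            with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, key) as k:
                return winreg.QueryValueEx(k, "Install")[0] == 1
        except OSError:
            return False

    # Tiny .Net-dependent build for most users, self-contained build for the rest.
    payload = "app_net35.exe" if has_net35() else "app_standalone.exe"
    print("Fetching and installing", payload)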

^ Quinxy