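Preface
Work has been busy lately, so I haven't had much free time to learn new things.
I've been running into Java deserialization quite a bit recently, so I used scattered after-work time to trace the internals of readObject().
None of this is new material, but I wanted to jot down the code-tracing process anyway.
Everyone knows how to use readObject, but probably few people have actually traced what happens underneath (?)
(This is also a chance to update a blog that hasn't had a technical post in a long time XD)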
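Serialization / Deserialization
- Serialization: converting an object into a byte sequence
- Deserialization: restoring an object from a byte sequence
The point is to make it easy to persist an object's state, or to send it over the network (common in distributed architectures) so that object state can be passed between machines.
Serialization is used very widely in Java; RMI, JMX, and EJB, for example, are all built on it.
Java deserialization is like deserialization in PHP or other languages: if the data being deserialized is user-controlled, it can lead to security problems.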
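Vulnerability history
The best-known Java deserialization issue is the Apache Commons Collections deserialization vulnerability that FoxGlove Security publicized in November 2015 (the underlying gadget chain had been presented earlier that year by Chris Frohoff and Gabriel Lawrence).
Because Commons Collections is a widely used third-party library,
the impact at the time was huge: WebSphere, JBoss, Jenkins, WebLogic, and others were all affected.
For details, see the original post: https://foxglovesecurity.com/2015/11/06/what-do-weblogic-websphere-jboss-jenkins-opennms-and-your-application-have-in-common-this-vulnerability/
In other words, once you find a deserialization entry point, and a vulnerable (older) version of Commons Collections is on the classpath, you can follow this gadget chain straight to RCE.
The go-to tool ysoserial generously collects gadget chains for the various Commons Collections versions and for other libraries, ready to use as-is.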
readObject
In PHP, we deserialize input with unserialize(input).
In Java, deserialization usually starts from ObjectInputStream.readObject(),
and an object must implement java.io.Serializable before it can be serialized.
Let's go straight to an example:
    import java.io.IOException;
    import java.io.ObjectInputStream;
    import java.io.Serializable;

    public class Kaibro implements Serializable {
        public String gg;

        public Kaibro() {
            gg = "meow";
        }

        private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            System.out.println("QQ");
        }
    }
    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.ObjectInputStream;
    import java.io.ObjectOutputStream;

    public class Main {
        public static void main(String[] args) throws Exception {
            Kaibro kb = new Kaibro();

            // Serialize kb to /tmp/ser; try-with-resources flushes and closes the stream.
            try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream("/tmp/ser"))) {
                out.writeObject(kb);
            }

            // Read /tmp/ser back; this triggers Kaibro.readObject(), which prints "QQ".
            try (ObjectInputStream ois = new ObjectInputStream(new FileInputStream("/tmp/ser"))) {
                Kaibro tmp = (Kaibro) ois.readObject();
                System.out.println(tmp.gg);
            }
        }
    }
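Here we use ObjectOutputStream.writeObject() to serialize the kb object into /tmp/ser,
and then use ObjectInputStream.readObject() to read /tmp/ser back and deserialize it.
Notice that the Kaibro class also has a method with the same name, readObject().
This method lets a developer customize how an object is restored during deserialization.
HashMap, for example, overrides readObject so that the object's state stays consistent after deserialization.
If a custom readObject calls other methods that we can keep abusing, we may be able to link to the next gadget and eventually build a complete gadget chain.
For example, the URLDNS gadget chain in ysoserial relies on HashMap.readObject() calling putVal() and hash(): hash() calls hashCode() on each key, and when the key is a java.net.URL, computing its hash code resolves the host name, which sends a DNS request.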
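The HashMap behavior is easy to observe without any network traffic. The following is a minimal sketch, not the URLDNS payload itself: `NoisyKey` and `HashCodeOnReadDemo` are made-up names, and the key only counts `hashCode()` calls where URLDNS would use a `java.net.URL`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.concurrent.atomic.AtomicInteger;

final class HashCodeOnReadDemo {
    /** Counts hashCode() calls so we can see what deserialization triggers. */
    static final AtomicInteger HASH_CALLS = new AtomicInteger();

    /** Hypothetical key class; URLDNS puts a java.net.URL in this position. */
    static final class NoisyKey implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;

        NoisyKey(String name) { this.name = name; }

        @Override public int hashCode() {
            HASH_CALLS.incrementAndGet();
            return name.hashCode();
        }

        @Override public boolean equals(Object o) {
            return o instanceof NoisyKey && ((NoisyKey) o).name.equals(name);
        }
    }

    /** Round-trips a one-entry HashMap and returns the hashCode() calls made while reading. */
    static int hashCallsDuringRead() {
        try {
            HashMap<NoisyKey, String> map = new HashMap<>();
            map.put(new NoisyKey("k"), "v");

            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(map);
            }

            HASH_CALLS.set(0); // ignore the call made by put() above
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                in.readObject(); // HashMap.readObject() re-hashes every key
            }
            return HASH_CALLS.get();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("hashCode() calls during readObject: " + hashCallsDuringRead());
    }
}
```

The caller never touches the key after deserialization, yet its `hashCode()` still runs. That is the property a gadget chain needs.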
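At this point some readers may be wondering:
what actually happens after ObjectInputStream.readObject(), and why does our own Kaibro.readObject() end up being called?
Let's trace the JDK source and see what really goes on behind the scenes.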
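Analysis
Everything below uses JDK 8 as the target of the analysis.
Before diving in, let's use the SerializationDumper tool to look at the structure of the serialized data produced by the earlier example.
The first two bytes, ac ed, mark this as a Java serialization stream;
the next two bytes, 00 05, are the version number.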
    $ cat /tmp/ser | xxd
    00000000: aced 0005 7372 0006 4b61 6962 726f e9d6  ....sr..Kaibro..
    00000010: ae3b 5461 820d 0200 014c 0002 6767 7400  .;Ta.....L..ggt.
    00000020: 124c 6a61 7661 2f6c 616e 672f 5374 7269  .Ljava/lang/Stri
    00000030: 6e67 3b78 7074 0004 6d65 6f77            ng;xpt..meow
    $ cat /tmp/ser | base64
    rO0ABXNyAAZLYWlicm/p1q47VGGCDQIAAUwAAmdndAASTGphdmEvbGFuZy9TdHJpbmc7eHB0AARtZW93
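So when testing a Java web application, as soon as you see a signature like ac ed 00 05 ... or rO0AB... (Base64),
you can guess that it is very likely a serialization stream and try to take deserialization exploitation further.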
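That check is simple to automate. Here is a small sketch of a header sniffer; the class and method names (`SerialSniffer`, `looksLikeJavaSerial`, `looksLikeBase64JavaSerial`) are my own, and it only inspects the four header bytes, so a match is a hint rather than proof.

```java
import java.util.Base64;

final class SerialSniffer {
    /** True if data starts with the stream magic 0xACED followed by version 0x0005. */
    static boolean looksLikeJavaSerial(byte[] data) {
        return data.length >= 4
            && (data[0] & 0xFF) == 0xAC
            && (data[1] & 0xFF) == 0xED
            && data[2] == 0x00
            && data[3] == 0x05;
    }

    /** Same check for Base64 text; anything that is not valid Base64 counts as "no". */
    static boolean looksLikeBase64JavaSerial(String text) {
        try {
            return looksLikeJavaSerial(Base64.getDecoder().decode(text.trim()));
        } catch (IllegalArgumentException notBase64) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(looksLikeBase64JavaSerial("rO0ABQ=="));  // header only
        System.out.println(looksLikeBase64JavaSerial("aGVsbG8="));  // "hello"
    }
}
```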
Now let's start directly from ObjectInputStream.readObject() and step into the source code:
    public final Object readObject()
        throws IOException, ClassNotFoundException {
        return readObject(Object.class);
    }
It simply returns readObject(Object.class), so we keep going:
    private final Object readObject(Class<?> type)
        throws IOException, ClassNotFoundException
    {
        if (enableOverride) {
            return readObjectOverride();
        }

        if (! (type == Object.class || type == String.class))
            throw new AssertionError("internal error");

        int outerHandle = passHandle;
        try {
            Object obj = readObject0(type, false);
            handles.markDependency(outerHandle, passHandle);
            ClassNotFoundException ex = handles.lookupException(passHandle);
            if (ex != null) {
                throw ex;
            }
            if (depth == 0) {
                vlist.doCallbacks();
            }
            return obj;
        } finally {
            passHandle = outerHandle;
            if (closed && depth == 0) {
                clear();
            }
        }
    }
It starts with an if check; the enableOverride it tests is set in the ObjectInputStream constructor:
    public ObjectInputStream(InputStream in) throws IOException {
        ...
        enableOverride = false;
        ...
    }
Any ObjectInputStream instance created through the constructor that takes an argument has this variable set to false.
Only the no-argument constructor sets enableOverride to true:
    protected ObjectInputStream() throws IOException, SecurityException {
        ...
        enableOverride = true;
        ...
    }
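The readObjectOverride() that runs when the condition holds is, in ObjectInputStream itself, just a stub that returns null; it only does something when a subclass overrides it.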
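To make the enableOverride branch concrete, here is a minimal sketch of a subclass that goes through the protected no-argument constructor. `CannedObjectInputStream` and `OverrideDemo` are hypothetical names, and the returned string is only a marker to show which path ran.

```java
import java.io.IOException;
import java.io.ObjectInputStream;

final class OverrideDemo {
    /** Hypothetical subclass that replaces the whole read path. */
    static final class CannedObjectInputStream extends ObjectInputStream {
        CannedObjectInputStream() throws IOException {
            super(); // protected no-arg constructor: this is what sets enableOverride = true
        }

        @Override
        protected Object readObjectOverride() {
            return "from readObjectOverride";
        }
    }

    /** Calls the final readObject(); with enableOverride set it delegates to the override. */
    static Object readViaOverride() {
        try {
            // Not closed on purpose: a stream built this way has no underlying input to close.
            return new CannedObjectInputStream().readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readViaOverride());
    }
}
```

No serialized bytes are involved at all here, which shows that this branch bypasses readObject0() and everything traced in the rest of this post.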
Moving on:
    Object obj = readObject0(type, false);
Step into readObject0():
    /**
     * Underlying readObject implementation.
     * @param type a type expected to be deserialized; non-null
     * @param unshared true if the object can not be a reference to a shared object, otherwise false
     */
    private Object readObject0(Class<?> type, boolean unshared) throws IOException {
        boolean oldMode = bin.getBlockDataMode();
        if (oldMode) {
            int remain = bin.currentBlockRemaining();
            if (remain > 0) {
                throw new OptionalDataException(remain);
            } else if (defaultDataEnd) {
                /*
                 * Fix for 4360508: stream is currently at the end of a field
                 * value block written via default serialization; since there
                 * is no terminating TC_ENDBLOCKDATA tag, simulate
                 * end-of-custom-data behavior explicitly.
                 */
                throw new OptionalDataException(true);
            }
            bin.setBlockDataMode(false);
        }

        byte tc;
        while ((tc = bin.peekByte()) == TC_RESET) {
            bin.readByte();
            handleReset();
        }

        depth++;
        totalObjectRefs++;
        try {
            switch (tc) {
                case TC_NULL:
                    return readNull();

                case TC_REFERENCE:
                    return type.cast(readHandle(unshared));

                case TC_CLASS:
                    if (type == String.class) {
                        throw new ClassCastException("Cannot cast a class to java.lang.String");
                    }
                    return readClass(unshared);

                case TC_CLASSDESC:
                case TC_PROXYCLASSDESC:
                    if (type == String.class) {
                        throw new ClassCastException("Cannot cast a class to java.lang.String");
                    }
                    return readClassDesc(unshared);

                case TC_STRING:
                case TC_LONGSTRING:
                    return checkResolve(readString(unshared));

                case TC_ARRAY:
                    if (type == String.class) {
                        throw new ClassCastException("Cannot cast an array to java.lang.String");
                    }
                    return checkResolve(readArray(unshared));

                case TC_ENUM:
                    if (type == String.class) {
                        throw new ClassCastException("Cannot cast an enum to java.lang.String");
                    }
                    return checkResolve(readEnum(unshared));

                case TC_OBJECT:
                    if (type == String.class) {
                        throw new ClassCastException("Cannot cast an object to java.lang.String");
                    }
                    return checkResolve(readOrdinaryObject(unshared));

                case TC_EXCEPTION:
                    if (type == String.class) {
                        throw new ClassCastException("Cannot cast an exception to java.lang.String");
                    }
                    IOException ex = readFatalException();
                    throw new WriteAbortedException("writing aborted", ex);

                case TC_BLOCKDATA:
                case TC_BLOCKDATALONG:
                    if (oldMode) {
                        bin.setBlockDataMode(true);
                        bin.peek();             // force header read
                        throw new OptionalDataException(
                            bin.currentBlockRemaining());
                    } else {
                        throw new StreamCorruptedException(
                            "unexpected block data");
                    }

                case TC_ENDBLOCKDATA:
                    if (oldMode) {
                        throw new OptionalDataException(true);
                    } else {
                        throw new StreamCorruptedException(
                            "unexpected end of block data");
                    }

                default:
                    throw new StreamCorruptedException(
                        String.format("invalid type code: %02X", tc));
            }
        } finally {
            depth--;
            bin.setBlockDataMode(oldMode);
        }
    }
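This is where the contents of the serialization stream actually start being processed.
The bin variable at the top is also initialized by the constructor; you can think of it as a reader for the serialization stream.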
    private final BlockDataInputStream bin;

    public ObjectInputStream(InputStream in) throws IOException {
        ...
        bin = new BlockDataInputStream(in);
        ...
        bin.setBlockDataMode(true);
    }
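BlockDataInputStream is the low-level data-reading class underneath ObjectInputStream; it does the actual reading of the serialization stream.
It has two read modes: default mode and block mode.
From the code you can see that in block mode it checks whether the current block still has unread bytes: if it does, an OptionalDataException is thrown; if not, it switches to default mode.
Next, tc = bin.peekByte() calls PeekInputStream.peek().
PeekInputStream extends InputStream, and the call ultimately ends up in InputStream.read().
So tc here is simply one byte peeked from the serialization stream.
For our earlier Kaibro example, the SerializationDumper output tells us that tc takes the TC_OBJECT branch: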
    case TC_OBJECT:
        if (type == String.class) {
            throw new ClassCastException("Cannot cast an object to java.lang.String");
        }
        return checkResolve(readOrdinaryObject(unshared));
The constant TC_OBJECT has the value 0x73 (see ObjectStreamConstants in the JDK source), meaning that what is being read is an object.
Step into readOrdinaryObject(unshared):
    /**
     * Reads and returns "ordinary" (i.e., not a String, Class,
     * ObjectStreamClass, array, or enum constant) object, or null if object's
     * class is unresolvable (in which case a ClassNotFoundException will be
     * associated with object's handle).  Sets passHandle to object's assigned
     * handle.
     */
    private Object readOrdinaryObject(boolean unshared)
        throws IOException
    {
        if (bin.readByte() != TC_OBJECT) {
            throw new InternalError();
        }

        ObjectStreamClass desc = readClassDesc(false);
        desc.checkDeserialize();

        Class<?> cl = desc.forClass();
        if (cl == String.class || cl == Class.class
                || cl == ObjectStreamClass.class) {
            throw new InvalidClassException("invalid class descriptor");
        }

        Object obj;
        try {
            obj = desc.isInstantiable() ? desc.newInstance() : null;
        } catch (Exception ex) {
            throw (IOException) new InvalidClassException(
                desc.forClass().getName(),
                "unable to create instance").initCause(ex);
        }

        passHandle = handles.assign(unshared ? unsharedMarker : obj);
        ClassNotFoundException resolveEx = desc.getResolveException();
        if (resolveEx != null) {
            handles.markException(passHandle, resolveEx);
        }

        if (desc.isExternalizable()) {
            readExternalData((Externalizable) obj, desc);
        } else {
            readSerialData(obj, desc);
        }

        handles.finish(passHandle);

        if (obj != null &&
            handles.lookupException(passHandle) == null &&
            desc.hasReadResolveMethod())
        {
            Object rep = desc.invokeReadResolve(obj);
            if (unshared && rep.getClass().isArray()) {
                rep = cloneArray(rep);
            }
            if (rep != obj) {
                if (rep != null) {
                    if (rep.getClass().isArray()) {
                        filterCheck(rep.getClass(), Array.getLength(rep));
                    } else {
                        filterCheck(rep.getClass(), -1);
                    }
                }
                handles.setObject(passHandle, obj = rep);
            }
        }

        return obj;
    }
Right at the start it calls readClassDesc(false), so we follow that:
    /**
     * Reads in and returns (possibly null) class descriptor.  Sets passHandle
     * to class descriptor's assigned handle.  If class descriptor cannot be
     * resolved to a class in the local VM, a ClassNotFoundException is
     * associated with the class descriptor's handle.
     */
    private ObjectStreamClass readClassDesc(boolean unshared)
        throws IOException
    {
        byte tc = bin.peekByte();
        ObjectStreamClass descriptor;
        switch (tc) {
            case TC_NULL:
                descriptor = (ObjectStreamClass) readNull();
                break;
            case TC_REFERENCE:
                descriptor = (ObjectStreamClass) readHandle(unshared);
                break;
            case TC_PROXYCLASSDESC:
                descriptor = readProxyDesc(unshared);
                break;
            case TC_CLASSDESC:
                descriptor = readNonProxyDesc(unshared);
                break;
            default:
                throw new StreamCorruptedException(
                    String.format("invalid type code: %02X", tc));
        }
        if (descriptor != null) {
            validateDescriptor(descriptor);
        }
        return descriptor;
    }
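The logic here does what the method name says: it tries to build a class descriptor from the serialization stream.
In our example the first byte read is TC_CLASSDESC (0x72), which stands for a class descriptor: a structure that describes a class, including its name, the types of its members, and so on.
So next it calls descriptor = readNonProxyDesc(unshared) to read that class descriptor.
Again, step in: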
    /**
     * Reads in and returns class descriptor for a class that is not a dynamic
     * proxy class.  Sets passHandle to class descriptor's assigned handle.  If
     * class descriptor cannot be resolved to a class in the local VM, a
     * ClassNotFoundException is associated with the descriptor's handle.
     */
    private ObjectStreamClass readNonProxyDesc(boolean unshared)
        throws IOException
    {
        if (bin.readByte() != TC_CLASSDESC) {
            throw new InternalError();
        }

        ObjectStreamClass desc = new ObjectStreamClass();
        int descHandle = handles.assign(unshared ? unsharedMarker : desc);
        passHandle = NULL_HANDLE;

        ObjectStreamClass readDesc = null;
        try {
            readDesc = readClassDescriptor();
        } catch (ClassNotFoundException ex) {
            throw (IOException) new InvalidClassException(
                "failed to read class descriptor").initCause(ex);
        }

        Class<?> cl = null;
        ClassNotFoundException resolveEx = null;
        bin.setBlockDataMode(true);
        final boolean checksRequired = isCustomSubclass();
        try {
            if ((cl = resolveClass(readDesc)) == null) {
                resolveEx = new ClassNotFoundException("null class");
            } else if (checksRequired) {
                ReflectUtil.checkPackageAccess(cl);
            }
        } catch (ClassNotFoundException ex) {
            resolveEx = ex;
        }

        filterCheck(cl, -1);

        skipCustomData();

        try {
            totalObjectRefs++;
            depth++;
            desc.initNonProxy(readDesc, cl, resolveEx, readClassDesc(false));
        } finally {
            depth--;
        }

        handles.finish(descHandle);
        passHandle = descHandle;

        return desc;
    }
It first creates an ObjectStreamClass object, desc, which represents the serialized class descriptor.
Then it calls readClassDescriptor(), which likewise creates an ObjectStreamClass object
and calls readNonProxy(this) on it.
Step into readNonProxy():
    /**
     * Reads non-proxy class descriptor information from given input stream.
     * The resulting class descriptor is not fully functional; it can only be
     * used as input to the ObjectInputStream.resolveClass() and
     * ObjectStreamClass.initNonProxy() methods.
     */
    void readNonProxy(ObjectInputStream in)
        throws IOException, ClassNotFoundException
    {
        name = in.readUTF();
        suid = Long.valueOf(in.readLong());
        isProxy = false;

        byte flags = in.readByte();
        hasWriteObjectData =
            ((flags & ObjectStreamConstants.SC_WRITE_METHOD) != 0);
        hasBlockExternalData =
            ((flags & ObjectStreamConstants.SC_BLOCK_DATA) != 0);
        externalizable =
            ((flags & ObjectStreamConstants.SC_EXTERNALIZABLE) != 0);
        boolean sflag =
            ((flags & ObjectStreamConstants.SC_SERIALIZABLE) != 0);
        if (externalizable && sflag) {
            throw new InvalidClassException(
                name, "serializable and externalizable flags conflict");
        }
        serializable = externalizable || sflag;
        isEnum = ((flags & ObjectStreamConstants.SC_ENUM) != 0);
        if (isEnum && suid.longValue() != 0L) {
            throw new InvalidClassException(name,
                "enum descriptor has non-zero serialVersionUID: " + suid);
        }

        int numFields = in.readShort();
        if (isEnum && numFields != 0) {
            throw new InvalidClassException(name,
                "enum descriptor has non-zero field count: " + numFields);
        }
        fields = (numFields > 0) ?
            new ObjectStreamField[numFields] : NO_FIELDS;
        for (int i = 0; i < numFields; i++) {
            char tcode = (char) in.readByte();
            String fname = in.readUTF();
            String signature = ((tcode == 'L') || (tcode == '[')) ?
                in.readTypeString() : new String(new char[] { tcode });
            try {
                fields[i] = new ObjectStreamField(fname, signature, false);
            } catch (RuntimeException e) {
                throw (IOException) new InvalidClassException(name,
                    "invalid descriptor for field " + fname).initCause(e);
            }
        }
        computeFieldOffsets();
    }
Let's take a quick look at the code behind readUTF():
    // Excerpts from ObjectInputStream and its inner class BlockDataInputStream
    public String readUTF() throws IOException {
        return readUTFBody(readUnsignedShort());
    }

    public int readUnsignedShort() throws IOException {
        return bin.readUnsignedShort();
    }

    public int readUnsignedShort() throws IOException {
        if (!blkmode) {
            pos = 0;
            in.readFully(buf, 0, 2);
        } else if (end - pos < 2) {
            return din.readUnsignedShort();
        }
        int v = Bits.getShort(buf, pos) & 0xFFFF;
        pos += 2;
        return v;
    }

    public void readFully(byte[] b, int off, int len) throws IOException {
        readFully(b, off, len, false);
    }

    public void readFully(byte[] b, int off, int len, boolean copy)
        throws IOException
    {
        while (len > 0) {
            int n = read(b, off, len, copy);
            if (n < 0) {
                throw new EOFException();
            }
            off += n;
            len -= n;
        }
    }

    public int read(byte[] buf, int off, int len) throws IOException {
        if (buf == null) {
            throw new NullPointerException();
        }
        int endoff = off + len;
        if (off < 0 || len < 0 || endoff > buf.length || endoff < 0) {
            throw new IndexOutOfBoundsException();
        }
        return bin.read(buf, off, len, false);
    }

    private String readUTFBody(long utflen) throws IOException {
        StringBuilder sbuf;
        if (utflen > 0 && utflen < Integer.MAX_VALUE) {
            int initialCapacity = Math.min((int)utflen, 0xFFFF);
            sbuf = new StringBuilder(initialCapacity);
        } else {
            sbuf = new StringBuilder();
        }

        if (!blkmode) {
            end = pos = 0;
        }

        while (utflen > 0) {
            int avail = end - pos;
            if (avail >= 3 || (long) avail == utflen) {
                utflen -= readUTFSpan(sbuf, utflen);
            } else {
                if (blkmode) {
                    utflen -= readUTFChar(sbuf, utflen);
                } else {
                    if (avail > 0) {
                        System.arraycopy(buf, pos, buf, 0, avail);
                    }
                    pos = 0;
                    end = (int) Math.min(MAX_BLOCK_SIZE, utflen);
                    in.readFully(buf, avail, end - avail);
                }
            }
        }

        return sbuf.toString();
    }
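These methods are basically the low-level details of reading data from the serialization stream.
So name = in.readUTF() reads from the stream the name of the class that this class descriptor represents.
The next line, suid = Long.valueOf(in.readLong()), reads the familiar serialVersionUID.
As everyone knows, serialVersionUID is the key field used during deserialization to verify that class versions match.
If the serialVersionUID in the stream differs from the local class's, deserialization throws an InvalidClassException.
Let me take a small detour here to explain how serialVersionUID is generated: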
    /**
     * Writes non-proxy class descriptor information to given output stream.
     */
    void writeNonProxy(ObjectOutputStream out) throws IOException {
        out.writeUTF(name);
        out.writeLong(getSerialVersionUID());
        ...
This method is called during serialization; inside it, getSerialVersionUID() tries to obtain the value of suid:
    /**
     * Return the serialVersionUID for this class.  The serialVersionUID
     * defines a set of classes all with the same name that have evolved from a
     * common root class and agree to be serialized and deserialized using a
     * common format.  NonSerializable classes have a serialVersionUID of 0L.
     *
     * @return  the SUID of the class described by this descriptor
     */
    public long getSerialVersionUID() {
        if (suid == null) {
            suid = AccessController.doPrivileged(
                new PrivilegedAction<Long>() {
                    public Long run() {
                        return computeDefaultSUID(cl);
                    }
                }
            );
        }
        return suid.longValue();
    }
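If suid is null (the class declares no serialVersionUID), it is computed by computeDefaultSUID(cl).
The computation creates a DataOutputStream and writes a series of values through it into the ByteArrayOutputStream it wraps: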
    ByteArrayOutputStream bout = new ByteArrayOutputStream();
    DataOutputStream dout = new DataOutputStream(bout);
Write the class name:
    dout.writeUTF(cl.getName());
Write the class modifiers:
    int classMods = cl.getModifiers() &
        (Modifier.PUBLIC | Modifier.FINAL |
         Modifier.INTERFACE | Modifier.ABSTRACT);
    Method[] methods = cl.getDeclaredMethods();
    if ((classMods & Modifier.INTERFACE) != 0) {
        classMods = (methods.length > 0) ?
            (classMods | Modifier.ABSTRACT) :
            (classMods & ~Modifier.ABSTRACT);
    }
    dout.writeInt(classMods);
Write the interface names, sorted by name:
    if (!cl.isArray()) {
        Class<?>[] interfaces = cl.getInterfaces();
        String[] ifaceNames = new String[interfaces.length];
        for (int i = 0; i < interfaces.length; i++) {
            ifaceNames[i] = interfaces[i].getName();
        }
        Arrays.sort(ifaceNames);
        for (int i = 0; i < ifaceNames.length; i++) {
            dout.writeUTF(ifaceNames[i]);
        }
    }
Sort the fields by name, then write each field's name, modifiers, and signature:
    Field[] fields = cl.getDeclaredFields();
    MemberSignature[] fieldSigs = new MemberSignature[fields.length];
    for (int i = 0; i < fields.length; i++) {
        fieldSigs[i] = new MemberSignature(fields[i]);
    }
    Arrays.sort(fieldSigs, new Comparator<MemberSignature>() {
        public int compare(MemberSignature ms1, MemberSignature ms2) {
            return ms1.name.compareTo(ms2.name);
        }
    });
    for (int i = 0; i < fieldSigs.length; i++) {
        MemberSignature sig = fieldSigs[i];
        int mods = sig.member.getModifiers() &
            (Modifier.PUBLIC | Modifier.PRIVATE | Modifier.PROTECTED |
             Modifier.STATIC | Modifier.FINAL | Modifier.VOLATILE |
             Modifier.TRANSIENT);
        if (((mods & Modifier.PRIVATE) == 0) ||
            ((mods & (Modifier.STATIC | Modifier.TRANSIENT)) == 0))
        {
            dout.writeUTF(sig.name);
            dout.writeInt(mods);
            dout.writeUTF(sig.signature);
        }
    }
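Note the condition here: a field is left out of the hash only when it is PRIVATE and also STATIC or TRANSIENT; every other field is written.
(This only concerns computing the suid. Separately, default serialization itself skips static and transient fields, which is why putting the transient keyword on a variable keeps it out of the serialized data.)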
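The effect of transient on the serialized data can be checked with a quick round trip. This is a minimal sketch; the `Login` class and its field values are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class TransientDemo {
    /** Hypothetical class with one ordinary field and one transient field. */
    static final class Login implements Serializable {
        private static final long serialVersionUID = 1L;
        String user = "kaibro";
        transient String password = "secret"; // skipped by default serialization
    }

    /** Serializes the object to memory and reads it straight back. */
    static Login roundTrip(Login original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Login) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Login copy = roundTrip(new Login());
        System.out.println(copy.user + " / " + copy.password);
    }
}
```

The copy's `password` comes back as null rather than "secret": the value was never written, and field initializers do not run again during deserialization.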
Moving on.
If the class has a static initializer, this part is written:
    if (hasStaticInitializer(cl)) {
        dout.writeUTF("<clinit>");
        dout.writeInt(Modifier.STATIC);
        dout.writeUTF("()V");
    }
(Note: a static initializer initializes the class; the code in a static block runs when the class is initialized by the JVM.)
Sort the constructors by signature, then write the non-private ones:
    Constructor<?>[] cons = cl.getDeclaredConstructors();
    MemberSignature[] consSigs = new MemberSignature[cons.length];
    for (int i = 0; i < cons.length; i++) {
        consSigs[i] = new MemberSignature(cons[i]);
    }
    Arrays.sort(consSigs, new Comparator<MemberSignature>() {
        public int compare(MemberSignature ms1, MemberSignature ms2) {
            return ms1.signature.compareTo(ms2.signature);
        }
    });
    for (int i = 0; i < consSigs.length; i++) {
        MemberSignature sig = consSigs[i];
        int mods = sig.member.getModifiers() &
            (Modifier.PUBLIC | Modifier.PRIVATE | Modifier.PROTECTED |
             Modifier.STATIC | Modifier.FINAL |
             Modifier.SYNCHRONIZED | Modifier.NATIVE |
             Modifier.ABSTRACT | Modifier.STRICT);
        if ((mods & Modifier.PRIVATE) == 0) {
            dout.writeUTF("<init>");
            dout.writeInt(mods);
            dout.writeUTF(sig.signature.replace('/', '.'));
        }
    }
Sort the methods by name and signature, then write the name, modifiers, and signature of each non-private method:
    MemberSignature[] methSigs = new MemberSignature[methods.length];
    for (int i = 0; i < methods.length; i++) {
        methSigs[i] = new MemberSignature(methods[i]);
    }
    Arrays.sort(methSigs, new Comparator<MemberSignature>() {
        public int compare(MemberSignature ms1, MemberSignature ms2) {
            int comp = ms1.name.compareTo(ms2.name);
            if (comp == 0) {
                comp = ms1.signature.compareTo(ms2.signature);
            }
            return comp;
        }
    });
    for (int i = 0; i < methSigs.length; i++) {
        MemberSignature sig = methSigs[i];
        int mods = sig.member.getModifiers() &
            (Modifier.PUBLIC | Modifier.PRIVATE | Modifier.PROTECTED |
             Modifier.STATIC | Modifier.FINAL |
             Modifier.SYNCHRONIZED | Modifier.NATIVE |
             Modifier.ABSTRACT | Modifier.STRICT);
        if ((mods & Modifier.PRIVATE) == 0) {
            dout.writeUTF(sig.name);
            dout.writeInt(mods);
            dout.writeUTF(sig.signature.replace('/', '.'));
        }
    }

    dout.flush();
Finally, the contents of bout are hashed with SHA-1, and the first 8 bytes of the digest (packed little-endian into a long) are returned as the suid:
    MessageDigest md = MessageDigest.getInstance("SHA");
    byte[] hashBytes = md.digest(bout.toByteArray());
    long hash = 0;
    for (int i = Math.min(hashBytes.length, 8) - 1; i >= 0; i--) {
        hash = (hash << 8) | (hashBytes[i] & 0xFF);
    }
    return hash;
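So now we know that not every change to a class affects the suid; adding a private method or a private static field, for example, leaves it unchanged.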
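The value the JDK ends up using can be inspected through the public ObjectStreamClass.lookup() API. Below is a small sketch, assuming two throwaway classes (`Declared` and `Computed`, both hypothetical): one declares serialVersionUID explicitly and the other leaves it to the default computation.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

final class SuidDemo {
    /** Declares its own serialVersionUID, so nothing is computed. */
    static final class Declared implements Serializable {
        private static final long serialVersionUID = 42L;
        int x;
    }

    /** Declares nothing, so the default suid computation is used. */
    static final class Computed implements Serializable {
        int x;
    }

    /** Returns the suid that serialization would use for the given Serializable class. */
    static long suidOf(Class<?> cl) {
        return ObjectStreamClass.lookup(cl).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println("Declared: " + suidOf(Declared.class));
        System.out.println("Computed: " + suidOf(Computed.class));
    }
}
```

The computed value depends on the class name and member signatures, so I deliberately don't show an expected number for `Computed`. Note that lookup() returns null for classes that are not Serializable, so `suidOf` is only meant for serializable classes.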
OK, enough of the detour; back to readNonProxy().
Once readNonProxy() has filled in the class name, suid, and so on, readClassDescriptor() returns this freshly initialized class descriptor.
That brings us back to readNonProxyDesc():
    private ObjectStreamClass readNonProxyDesc(boolean unshared)
        throws IOException
    {
        ...

        Class<?> cl = null;
        ClassNotFoundException resolveEx = null;
        bin.setBlockDataMode(true);
        final boolean checksRequired = isCustomSubclass();
        try {
            if ((cl = resolveClass(readDesc)) == null) {
                resolveEx = new ClassNotFoundException("null class");
            } else if (checksRequired) {
                ReflectUtil.checkPackageAccess(cl);
            }
        } catch (ClassNotFoundException ex) {
            resolveEx = ex;
        }

        filterCheck(cl, -1);

        ...
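The freshly initialized class descriptor readDesc is passed to resolveClass().
What resolveClass() does is simple: through reflection, it obtains and returns the Class object that the descriptor describes, which in our example is Kaibro.
Reflection:
Java is a statically typed language and lacks PHP's many dynamic features, but reflection makes it considerably more dynamic.
The core idea is that classes are loaded, and methods and fields are invoked or accessed, dynamically at run time, without the target having to be fixed in advance.
For example, a class that your program never imports can be loaded through reflection: Class<?> cls = Class.forName("java.lang.Runtime");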
    protected Class<?> resolveClass(ObjectStreamClass desc)
        throws IOException, ClassNotFoundException
    {
        String name = desc.getName();
        try {
            return Class.forName(name, false, latestUserDefinedLoader());
        } catch (ClassNotFoundException ex) {
            Class<?> cl = primClasses.get(name);
            if (cl != null) {
                return cl;
            } else {
                throw ex;
            }
        }
    }
Next it calls filterCheck(cl, -1), where cl is the Class object that resolveClass() just returned:
    /**
     * Invoke the serialization filter if non-null.
     * If the filter rejects or an exception is thrown, throws InvalidClassException.
     *
     * @param clazz the class; may be null
     * @param arrayLength the array length requested; use {@code -1} if not creating an array
     * @throws InvalidClassException if it rejected by the filter or
     *        a {@link RuntimeException} is thrown
     */
    private void filterCheck(Class<?> clazz, int arrayLength)
            throws InvalidClassException {
        if (serialFilter != null) {
            RuntimeException ex = null;
            ObjectInputFilter.Status status;
            long bytesRead = (bin == null) ? 0 : bin.getBytesRead();
            try {
                status = serialFilter.checkInput(new FilterValues(clazz, arrayLength,
                        totalObjectRefs, depth, bytesRead));
            } catch (RuntimeException e) {
                status = ObjectInputFilter.Status.REJECTED;
                ex = e;
            }
            if (status == null ||
                    status == ObjectInputFilter.Status.REJECTED) {
                if (Logging.infoLogger != null) {
                    Logging.infoLogger.info(
                            "ObjectInputFilter {0}: {1}, array length: {2}, nRefs: {3}, depth: {4}, bytes: {5}, ex: {6}",
                            status, clazz, arrayLength, totalObjectRefs, depth, bytesRead,
                            Objects.toString(ex, "n/a"));
                }
                InvalidClassException ice = new InvalidClassException("filter status: " + status);
                ice.initCause(ex);
                throw ice;
            } else {
                if (Logging.traceLogger != null) {
                    Logging.traceLogger.finer(
                            "ObjectInputFilter {0}: {1}, array length: {2}, nRefs: {3}, depth: {4}, bytes: {5}, ex: {6}",
                            status, clazz, arrayLength, totalObjectRefs, depth, bytesRead,
                            Objects.toString(ex, "n/a"));
                }
            }
        }
    }
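The serialFilter used here is obtained when the ObjectInputStream is initialized:
    serialFilter = ObjectInputFilter.Config.getSerialFilter();
When serialFilter is set, filterCheck() runs the check and throws an InvalidClassException if the input is rejected.
This is in fact the famous JEP 290 defense mechanism.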
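For completeness, here is what using the filter looks like from application code. This sketch assumes Java 9 or later, where the API is public as java.io.ObjectInputFilter; the JDK 8 builds analyzed in this post ship the same idea as sun.misc.ObjectInputFilter, which this example does not use. `FilterDemo` and `passesFilter` are my own names, and the pattern string is only an example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidClassException;
import java.io.ObjectInputFilter;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

final class FilterDemo {
    /** Serializes value, then reports whether a pattern-based filter lets it through. */
    static boolean passesFilter(Object value, String pattern) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                // Per-stream filter; must be set before the first readObject() call.
                in.setObjectInputFilter(ObjectInputFilter.Config.createFilter(pattern));
                in.readObject();
                return true;
            }
        } catch (InvalidClassException rejected) {
            return false; // filterCheck() threw "filter status: REJECTED"
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String pattern = "java.lang.*;!*"; // allow classes in java.lang, reject everything else
        System.out.println(passesFilter(Integer.valueOf(7), pattern));
        System.out.println(passesFilter(new java.util.ArrayList<String>(), pattern));
    }
}
```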
Back to the second half of readNonProxyDesc():
    private ObjectStreamClass readNonProxyDesc(boolean unshared)
        throws IOException
    {
        ...
        desc.initNonProxy(readDesc, cl, resolveEx, readClassDesc(false));

        handles.finish(descHandle);
        passHandle = descHandle;

        return desc;
    }
Let's step into initNonProxy():
    /**
     * Initializes class descriptor representing a non-proxy class.
     */
    void initNonProxy(ObjectStreamClass model,
                      Class<?> cl,
                      ClassNotFoundException resolveEx,
                      ObjectStreamClass superDesc)
        throws InvalidClassException
    {
        this.cl = cl;
        this.resolveEx = resolveEx;
        this.superDesc = superDesc;
        name = model.name;
        suid = Long.valueOf(model.getSerialVersionUID());
        isProxy = false;
        isEnum = model.isEnum;
        serializable = model.serializable;
        externalizable = model.externalizable;
        hasBlockExternalData = model.hasBlockExternalData;
        hasWriteObjectData = model.hasWriteObjectData;
        fields = model.fields;
        primDataSize = model.primDataSize;
        numObjFields = model.numObjFields;

        if (cl != null) {
            localDesc = lookup(cl, true);
            ...

            cons = localDesc.cons;
            writeObjectMethod = localDesc.writeObjectMethod;
            readObjectMethod = localDesc.readObjectMethod;
            readObjectNoDataMethod = localDesc.readObjectNoDataMethod;
            writeReplaceMethod = localDesc.writeReplaceMethod;
            readResolveMethod = localDesc.readResolveMethod;
            if (deserializeEx == null) {
                deserializeEx = localDesc.deserializeEx;
            }
        }
        fieldRefl = getReflector(fields, localDesc);
        fields = fieldRefl.getFields();
    }
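This method does a lot of initialization,
including the suid checks and computation discussed earlier.
One thing to keep straight: the model parameter is the readDesc we just read out of the serialization stream, while initNonProxy itself is being called on the desc we created a moment ago.
The method uses the properties of readDesc (the one restored from the stream) to initialize desc, so it first has to verify that readDesc is valid.
To do that, it compares readDesc against localDesc, a descriptor built locally from the class, checking that the suid, class name, and so on match; if they don't, it throws an exception.
Here localDesc = lookup(cl, true) returns the class descriptor corresponding to a given class: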
    static ObjectStreamClass lookup(Class<?> cl, boolean all) {
        ...
        if (entry == null) {
            try {
                entry = new ObjectStreamClass(cl);
            } catch (Throwable th) {
                entry = th;
            }
            ...
        }
        if (entry instanceof ObjectStreamClass) {
            return (ObjectStreamClass) entry;
        ...
As you can see, it creates a new ObjectStreamClass object.
Let's look at the ObjectStreamClass constructor:
    /**
     * Creates local class descriptor representing given class.
     */
    private ObjectStreamClass(final Class<?> cl) {
        this.cl = cl;
        name = cl.getName();
        isProxy = Proxy.isProxyClass(cl);
        isEnum = Enum.class.isAssignableFrom(cl);
        serializable = Serializable.class.isAssignableFrom(cl);
        externalizable = Externalizable.class.isAssignableFrom(cl);

        Class<?> superCl = cl.getSuperclass();
        superDesc = (superCl != null) ? lookup(superCl, false) : null;
        localDesc = this;

        if (serializable) {
            AccessController.doPrivileged(new PrivilegedAction<Void>() {
                public Void run() {
                    if (isEnum) {
                        suid = Long.valueOf(0);
                        fields = NO_FIELDS;
                        return null;
                    }
                    if (cl.isArray()) {
                        fields = NO_FIELDS;
                        return null;
                    }

                    suid = getDeclaredSUID(cl);
                    try {
                        fields = getSerialFields(cl);
                        computeFieldOffsets();
                    } catch (InvalidClassException e) {
                        serializeEx = deserializeEx =
                            new ExceptionInfo(e.classname, e.getMessage());
                        fields = NO_FIELDS;
                    }

                    if (externalizable) {
                        cons = getExternalizableConstructor(cl);
                    } else {
                        cons = getSerializableConstructor(cl);
                        writeObjectMethod = getPrivateMethod(cl, "writeObject",
                            new Class<?>[] { ObjectOutputStream.class },
                            Void.TYPE);
                        readObjectMethod = getPrivateMethod(cl, "readObject",
                            new Class<?>[] { ObjectInputStream.class },
                            Void.TYPE);
                        readObjectNoDataMethod = getPrivateMethod(
                            cl, "readObjectNoData", null, Void.TYPE);
                        hasWriteObjectData = (writeObjectMethod != null);
                    }
                    writeReplaceMethod = getInheritableMethod(
                        cl, "writeReplace", null, Object.class);
                    readResolveMethod = getInheritableMethod(
                        cl, "readResolve", null, Object.class);
                    return null;
                }
            });
        ...
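Here cons is the constructor used to instantiate cl. For a Serializable (non-Externalizable) class, getSerializableConstructor() returns the no-argument constructor of the first non-serializable superclass, not the class's own constructor.
The writeObjectMethod, readObjectMethod, and readObjectNoDataMethod that follow are all obtained through reflection via getPrivateMethod():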
    /**
     * Returns non-static private method with given signature defined by given
     * class, or null if none found.  Access checks are disabled on the
     * returned method (if any).
     */
    private static Method getPrivateMethod(Class<?> cl, String name,
                                           Class<?>[] argTypes,
                                           Class<?> returnType)
    {
        try {
            Method meth = cl.getDeclaredMethod(name, argTypes);
            meth.setAccessible(true);
            int mods = meth.getModifiers();
            return ((meth.getReturnType() == returnType) &&
                    ((mods & Modifier.STATIC) == 0) &&
                    ((mods & Modifier.PRIVATE) != 0)) ? meth : null;
        } catch (NoSuchMethodException ex) {
            return null;
        }
    }
Then we return to initNonProxy():
    localDesc = lookup(cl, true);
    ...
    cons = localDesc.cons;
    writeObjectMethod = localDesc.writeObjectMethod;
    readObjectMethod = localDesc.readObjectMethod;
    readObjectNoDataMethod = localDesc.readObjectNoDataMethod;
    writeReplaceMethod = localDesc.writeReplaceMethod;
    ...
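The ObjectStreamClass object we just built is the localDesc here.
The code copies localDesc's constructor, writeObjectMethod, readObjectMethod, readObjectNoDataMethod, writeReplaceMethod, and so on onto the current object,
which is the desc object back in readNonProxyDesc().
So desc is now fully initialized and holds the constructor, readObjectMethod, readObjectNoDataMethod, and other members we just obtained through reflection.
This object is then returned to readClassDesc() as descriptor,
where it goes through a validator check: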
    if (descriptor != null) {
        validateDescriptor(descriptor);
    }
Once the check passes, control returns to readOrdinaryObject(), where we started:
    private Object readOrdinaryObject(boolean unshared)
        throws IOException
    {
        ...
        ObjectStreamClass desc = readClassDesc(false);
        ...
        Object obj;
        try {
            obj = desc.isInstantiable() ? desc.newInstance() : null;
        } catch (Exception ex) {
            throw (IOException) new InvalidClassException(
                desc.forClass().getName(),
                "unable to create instance").initCause(ex);
        }
        ...
        if (desc.isExternalizable()) {
            readExternalData((Externalizable) obj, desc);
        } else {
            readSerialData(obj, desc);
        }
Here desc.newInstance() is called to instantiate the object. Under the hood, it creates the object with the Constructor we obtained earlier:
```java
Object newInstance()
    throws InstantiationException, InvocationTargetException,
           UnsupportedOperationException
{
    if (cons != null) {
        try {
            return cons.newInstance();
        } catch (IllegalAccessException ex) {
            throw new InternalError(ex);
        }
    } else {
        throw new UnsupportedOperationException();
    }
}
```
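One consequence worth noting: to my understanding of the serialization spec, for a Serializable class the cons stored in the descriptor is the no-arg constructor of the first non-serializable superclass (often Object), not the class's own constructor. Below is a minimal sketch to observe this. ConstructorSkipDemo, Sample, constructorCalls and roundTrip are names I made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class ConstructorSkipDemo {

    // Counts how many times Sample's own constructor runs.
    static int constructorCalls = 0;

    // Hypothetical stand-in for the Kaibro class from this post.
    static class Sample implements Serializable {
        private static final long serialVersionUID = 1L;
        String gg;

        Sample() {
            constructorCalls++;
            gg = "meow";
        }
    }

    // Serializes the object into memory and reads it back.
    static Sample roundTrip(Sample original) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(buf.toByteArray()))) {
                return (Sample) in.readObject();
            }
        } catch (IOException | ClassNotFoundException ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        Sample copy = roundTrip(new Sample());
        // The constructor ran for "new Sample()" only, not for the deserialized copy.
        System.out.println("constructor calls: " + constructorCalls);
        // gg was restored from the stream, not set by the constructor.
        System.out.println("gg: " + copy.gg);
    }
}
```

So any initialization logic that lives only in the constructor is skipped on deserialization. The field values come from the stream instead.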
Next, when desc is not Externalizable, readSerialData(obj, desc) is called. Let's keep following:
```java
/**
 * Reads (or attempts to skip, if obj is null or is tagged with a
 * ClassNotFoundException) instance data for each serializable class of
 * object in stream, from superclass to subclass. Expects that passHandle
 * is set to obj's handle before this method is called.
 */
private void readSerialData(Object obj, ObjectStreamClass desc)
    throws IOException
{
    ObjectStreamClass.ClassDataSlot[] slots = desc.getClassDataLayout();
    for (int i = 0; i < slots.length; i++) {
        ObjectStreamClass slotDesc = slots[i].desc;

        if (slots[i].hasData) {
            if (obj == null || handles.lookupException(passHandle) != null) {
                defaultReadFields(null, slotDesc);
            } else if (slotDesc.hasReadObjectMethod()) {
                ThreadDeath t = null;
                boolean reset = false;
                SerialCallbackContext oldContext = curContext;
                if (oldContext != null)
                    oldContext.check();
                try {
                    curContext = new SerialCallbackContext(obj, slotDesc);

                    bin.setBlockDataMode(true);
                    slotDesc.invokeReadObject(obj, this);
                } catch (ClassNotFoundException ex) {
                    /*
                     * In most cases, the handle table has already
                     * propagated a CNFException to passHandle at this
                     * point; this mark call is included to address cases
                     * where the custom readObject method has cons'ed and
                     * thrown a new CNFException of its own.
                     */
                    handles.markException(passHandle, ex);
                } finally {
                    curContext.setUsed();
                    curContext = oldContext;
                }

                /*
                 * defaultDataEnd may have been set indirectly by custom
                 * readObject() method when calling defaultReadObject() or
                 * readFields(); clear it to restore normal read behavior.
                 */
                defaultDataEnd = false;
            } else {
                defaultReadFields(obj, slotDesc);
            }
            ...
```
If the class defines its own readObject, slotDesc.invokeReadObject(obj, this) is called.
If not, defaultReadFields is called to fill in the data.
invokeReadObject() simply calls our custom readObject:
```java
/**
 * Invokes the readObject method of the represented serializable class.
 * Throws UnsupportedOperationException if this class descriptor is not
 * associated with a class, or if the class is externalizable,
 * non-serializable or does not define readObject.
 */
void invokeReadObject(Object obj, ObjectInputStream in)
    throws ClassNotFoundException, IOException,
           UnsupportedOperationException
{
    if (readObjectMethod != null) {
        try {
            readObjectMethod.invoke(obj, new Object[]{ in });
        } catch (InvocationTargetException ex) {
            Throwable th = ex.getTargetException();
            if (th instanceof ClassNotFoundException) {
                throw (ClassNotFoundException) th;
            } else if (th instanceof IOException) {
                throw (IOException) th;
            } else {
                throwMiscException(th);
            }
        } catch (IllegalAccessException ex) {
            throw new InternalError(ex);
        }
    } else {
        throw new UnsupportedOperationException();
    }
}
```
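Here we can see the call readObjectMethod.invoke(obj, new Object[]{ in }).
This readObjectMethod is the readObject method that was set up through reflection earlier, namely Kaibro.readObject.
So we have finally reached our original goal: we traced all the way from ObjectInputStream.readObject() down to our own Kaibro.readObject().
Done!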
One last note: a custom readObject() usually calls ObjectInputStream.defaultReadObject(), which reads the non-static, non-transient fields from the stream.
In the Kaibro example, the first line of readObject() calls in.defaultReadObject().
Let's trace this method:
```java
/**
 * Read the non-static and non-transient fields of the current class from
 * this stream. This may only be called from the readObject method of the
 * class being deserialized. It will throw the NotActiveException if it is
 * called otherwise.
 *
 * @throws ClassNotFoundException if the class of a serialized object
 *         could not be found.
 * @throws IOException if an I/O error occurs.
 * @throws NotActiveException if the stream is not currently reading
 *         objects.
 */
public void defaultReadObject()
    throws IOException, ClassNotFoundException
{
    SerialCallbackContext ctx = curContext;
    if (ctx == null) {
        throw new NotActiveException("not in call to readObject");
    }
    Object curObj = ctx.getObj();
    ObjectStreamClass curDesc = ctx.getDesc();
    bin.setBlockDataMode(false);
    defaultReadFields(curObj, curDesc);
    bin.setBlockDataMode(true);
    if (!curDesc.hasWriteObjectData()) {
        /*
         * Fix for 4360508: since stream does not contain terminating
         * TC_ENDBLOCKDATA tag, set flag so that reading code elsewhere
         * knows to simulate end-of-custom-data behavior.
         */
        defaultDataEnd = true;
    }
    ClassNotFoundException ex = handles.lookupException(passHandle);
    if (ex != null) {
        throw ex;
    }
}
```
As we can see, this method also ends up calling defaultReadFields(curObj, curDesc) to fill in the object's fields.
So if we remove defaultReadObject(), the object's fields are not restored properly.
Taking the Kaibro class again: if in.defaultReadObject() is removed, System.out.println(tmp.gg) prints null after deserialization.
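The difference is easy to reproduce side by side. The sketch below uses two hypothetical classes, WithDefault and WithoutDefault, which are identical except for the defaultReadObject() call. The roundTrip helper is also my own naming, and it serializes in memory instead of going through /tmp/ser.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class DefaultReadObjectDemo {

    // Custom readObject that restores the fields via defaultReadObject().
    static class WithDefault implements Serializable {
        private static final long serialVersionUID = 1L;
        String gg = "meow";

        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();
        }
    }

    // Custom readObject that never reads the fields from the stream.
    static class WithoutDefault implements Serializable {
        private static final long serialVersionUID = 1L;
        String gg = "meow";

        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            // defaultReadObject() intentionally left out
        }
    }

    // Serializes the object into memory and reads it back.
    static Object roundTrip(Object obj) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(buf.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        WithDefault a = (WithDefault) roundTrip(new WithDefault());
        WithoutDefault b = (WithoutDefault) roundTrip(new WithoutDefault());
        System.out.println(a.gg); // field restored from the stream
        System.out.println(b.gg); // field left at its default value
    }
}
```

Note that the field initializer gg = "meow" does not rescue the second class either, since the class's own constructor and initializers do not run during deserialization.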
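Summary
In this post we traced the flow using the Kaibro class, which implements Serializable, and did not dig into the Externalizable case.
The flow is largely the same, though, and interested readers can trace it themselves.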
Externalizable:
This interface extends Serializable and adds two methods: writeExternal and readExternal.
These two methods are called during serialization and deserialization, respectively.
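For readers who have not used it before, here is a minimal sketch of an Externalizable class. ExternalizableDemo, Point and roundTrip are hypothetical names for illustration. Unlike the Serializable case, the class needs a public no-arg constructor, and it is fully responsible for writing and reading its own fields.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

public class ExternalizableDemo {

    public static class Point implements Externalizable {
        int x;
        int y;

        // Required: deserialization creates the object through this constructor.
        public Point() {
        }

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        // Called during serialization; we decide what goes into the stream.
        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeInt(x);
            out.writeInt(y);
        }

        // Called during deserialization; must read in the same order as written.
        @Override
        public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException {
            x = in.readInt();
            y = in.readInt();
        }
    }

    // Serializes the point into memory and reads it back.
    static Point roundTrip(Point p) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
                out.writeObject(p);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(buf.toByteArray()))) {
                return (Point) in.readObject();
            }
        } catch (IOException | ClassNotFoundException ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        Point copy = roundTrip(new Point(3, 4));
        System.out.println(copy.x + ", " + copy.y);
    }
}
```

In terms of the flow traced above, this is the desc.isExternalizable() branch of readOrdinaryObject(), which goes to readExternalData() instead of readSerialData().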
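Since the goal of this post was to find out when a custom readObject gets called, it does not analyze the Java serialization format or how it is read in any detail.
Readers interested in that can look at the Java Serialization Protocol spec: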
https://docs.oracle.com/javase/8/docs/platform/serialization/spec/protocol.html
Finally, a simplified view of the whole execution flow:
- ObjectInputStream.readObject()
  - readObject0()
    - readOrdinaryObject()
      - desc = readClassDesc(false)
        - descriptor = readNonProxyDesc(unshared)
          - readDesc = readClassDescriptor()
          - cl = resolveClass(readDesc)
          - filterCheck(cl, -1)
          - desc.initNonProxy(readDesc, cl, resolveEx, readClassDesc(false))
          - return desc
        - return descriptor
      - obj = desc.isInstantiable() ? desc.newInstance() : null
      - readSerialData(obj, desc)
        - slotDesc.invokeReadObject(obj, this)
          - readObjectMethod.invoke(obj, new Object[]{ in })
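If you want to check this chain on your own JDK without a debugger, one option is to capture a stack trace from inside the custom readObject. This is only a sketch: CallChainDemo, Sample, capturedTrace and captureChain are names I made up, and the exact internal frames (and any reflection frames in between) can differ between JDK versions.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

public class CallChainDemo {

    // Filled in by Sample.readObject() while the object is being deserialized.
    static StackTraceElement[] capturedTrace;

    // Hypothetical stand-in for the Kaibro class from this post.
    static class Sample implements Serializable {
        private static final long serialVersionUID = 1L;
        String gg = "meow";

        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            // Record how we got here.
            capturedTrace = new Throwable().getStackTrace();
        }
    }

    // Round-trips a Sample in memory and returns the captured frames as
    // "ClassName.methodName" strings, innermost frame first.
    static List<String> captureChain() {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
                out.writeObject(new Sample());
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(buf.toByteArray()))) {
                in.readObject();
            }
        } catch (IOException | ClassNotFoundException ex) {
            throw new IllegalStateException(ex);
        }
        List<String> frames = new ArrayList<>();
        for (StackTraceElement element : capturedTrace) {
            frames.add(element.getClassName() + "." + element.getMethodName());
        }
        return frames;
    }

    public static void main(String[] args) {
        for (String frame : captureChain()) {
            System.out.println(frame);
        }
    }
}
```

Reading the printed frames from bottom to top should roughly match the flow above, starting at ObjectInputStream.readObject() and ending at the custom readObject.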
I wrote this post casually in my spare time, so if anything is wrong or unclear, feel free to leave a comment!