An idea for modifying kernel non-paged write-protected memory without CR0 or MDL (x64)

The idea of this article is to map a contiguous range of virtual addresses to a new address by filling in page table entries ourselves, and at the same time map the read-only memory to be modified with the Dirty bit set in the new page table entries. In Windows, write protection is enforced per virtual address: if no new mapping is established, an attempted write to read-only memory still triggers a BugCheck even with Dirty set; if a new mapping is established but Dirty is not set, the write triggers a PAGE_FAULT BugCheck instead. Both steps are therefore indispensable.

To fill in page table entries, PTEBase must first be located dynamically. The standard approach is the page-table self-mapping method; the code is as follows:

ULONG_PTR PTEBase = 0;
BOOLEAN hzqstGetPTEBase()
{
    BOOLEAN Result = FALSE;
    ULONG_PTR PXEPA = __readcr3() & 0xFFFFFFFFF000;
    PHYSICAL_ADDRESS PXEPAParam;
    PXEPAParam.QuadPart = (LONGLONG)PXEPA;
    ULONG_PTR PXEVA = (ULONG_PTR)MmGetVirtualForPhysical(PXEPAParam);
    if (PXEVA)
    {
        ULONG_PTR PXEOffset = 0;
        do
        {
            if ((*(PULONGLONG)(PXEVA + PXEOffset) & 0xFFFFFFFFF000) == PXEPA)
            {
                PTEBase = (PXEOffset + 0xFFFF000) << 36;
                Result = TRUE;
                break;
            }
            PXEOffset += 8;
        } while (PXEOffset < PAGE_SIZE);
    }
    return Result;
}

Incidentally, here is another way to obtain it. The basic idea relies on the NT-exported function:

NTKERNELAPI ULONG NTAPI KeCapturePersistentThreadState(PCONTEXT Context, 
PKTHREAD Thread, ULONG BugCheckCode, ULONG_PTR BugCheckParameter1, 
ULONG_PTR BugCheckParameter2, ULONG_PTR BugCheckParameter3, 
ULONG_PTR BugCheckParameter4, PVOID VirtualAddress);

The 0x40000 bytes of dump data contain MmPfnDataBase. MmPfnDataBase is an array of per-page memory descriptors indexed by physical page frame number; among its fields is PteAddress, the address of the PTE that maps that physical page. Pass in a valid physical address (for example the current CR3: __readcr3() & 0xFFFFFFFFF000) and read out its PteAddress; since PTEBase must be a multiple of 0x8000000000, PTEBase can be computed directly from PteAddress. Note that starting with the Win10 RS1 (Anniversary Update) previews, KeCapturePersistentThreadState is gated by the global variable ForceDumpDisabled: if the relevant subkeys under "HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\CrashControl" meet certain conditions, this variable is set to 1 at startup, causing the KeCapturePersistentThreadState call to fail. Putting this together, the second version of the PTEBase-obtaining code is as follows:

//If you test on Win10, you need to add "#define _WIN10 1" here
#ifdef _WIN10
#define OFFSET_PTEADDRESS 0x8
#else
#define OFFSET_PTEADDRESS 0x10
#endif
ULONG_PTR PTEBase = 0;
PBOOLEAN LocateForceDumpDisabledInRange(ULONG_PTR StartAddress, 
ULONG MaximumBytesToSearch)
{
    PBOOLEAN Result = 0;
    ULONG_PTR p = StartAddress;
    ULONG_PTR pEnd = p + MaximumBytesToSearch;
    do
    {
        //cmp cs:ForceDumpDisabled, al
        //jnz ...
        if ((*(PULONGLONG)p & 0xFFFF00000000FFFF) == 0x850F000000000538)
        {
            Result = (PBOOLEAN)(p + 6 + *(PLONG)(p + 2)); //RIP-relative target of the cmp
            break;
        }
        p++;
    } while (p < pEnd);
    return Result;
}
BOOLEAN GetPTEBase()
{
    BOOLEAN Result = FALSE;
    CONTEXT Context = { 0 };
    Context.Rcx = (ULONG64)&Context;
    PUCHAR DumpData = (PUCHAR)ExAllocatePool(NonPagedPool, 0x40000);
    if (DumpData)
    {
        PBOOLEAN pForceDumpDisabled = LocateForceDumpDisabledInRange(
            (ULONG_PTR)KeCapturePersistentThreadState, 0x300);
        if (pForceDumpDisabled) *pForceDumpDisabled = FALSE;
        if (KeCapturePersistentThreadState(&Context, 0, 0, 0, 0, 0, 0, DumpData) == 0x40000)
        {
            ULONG_PTR MmPfnDataBase = *(PULONG_PTR)(DumpData + 0x18);
            PTEBase = *(PULONG_PTR)(MmPfnDataBase + OFFSET_PTEADDRESS + (((ULONG_PTR)(__readcr3() & 0xFFFFFFFFF000) >> 8) * 3)) & 0xFFFFFF8000000000;
            Result = TRUE;
        }
        ExFreePool(DumpData);
    }
    return Result;
}

 

With PTEBase obtained, we now come to the main topic. The idea of the code below is: starting from a valid 512G-aligned kernel virtual address and scanning up to 0xFFFFFFFFFFFFFFFF, find an unoccupied PML4 entry (hereafter referred to as a PXE), i.e. an entry whose Valid bit is 0.

For the starting kernel virtual address, I initially chose MmSystemRangeStart. Virtual-machine testing showed the mapping succeeded on 8.1/10, but on Vista/7/8 the CPU did not accept the mapped address as valid and triggered a BugCheck. Debugging revealed that MmSystemRangeStart is 0xFFFF080000000000 on Vista/7/8 and 0xFFFF800000000000 on 8.1/10, that the mapped address on Vista/7/8 fell in the range [0xFFFF080000000000, 0xFFFF800000000000), and that the CrashDump reported a Noncanonical Virtual Address. After consulting the Intel manual, I found that current Intel CPUs limit virtual addressing to 48 bits. On 64-bit Windows, bit 47 distinguishes user-mode from kernel-mode addresses, so a kernel address effectively has only 47 free bits, and the valid starting kernel virtual address is 0xFFFF800000000000. For rigor, you can use CPUID leaf 0x80000008: bits 15:8 of EAX give the maximum number of virtual-address bits supported by the processor. Call that number x; for a 64-bit address, setting the top (65 - x) bits to 1 and the remaining (x - 1) bits to 0 yields the valid starting kernel virtual address.

After finding an unused PXE entry, allocate a block of contiguous physical memory. The initial size is one page for the PPE page described by the PXE entry plus 512 pages for its 512 PPE entries, i.e. (1 + 0x200) * PAGE_SIZE. If allocation fails, halve the number of PPE pages requested and retry, and so on. Because the memory must be physically contiguous, MmAllocateContiguousMemory is the best choice: with ExAllocatePool, unless the request happens to land on a 2M large page, the physical addresses behind the virtual range are probably not contiguous, which would add a lot of trouble when filling in the page frame numbers of the 512 PPE entries.

Once the contiguous physical memory is obtained, the first page is installed into the target PXE entry, and the page frame numbers of pages 2 through 513 are filled, in order, into the 512 PPE entries on that first page. Then, given any virtual address to be mapped, first align it down to 0x8000000000 (512G), then walk its PXE, PPE, and PDE entries in turn. If the PXE entry has Valid 0 or LargePage 1, it is not mapped; otherwise examine the PPE entries one by one. If a PPE entry's Valid is 0, zero the corresponding new PDE page. If Valid is 1, handle the large-page case: for a 1G large page, split it evenly into 512 2M large pages and fill the corresponding page frame numbers, in order, into the 512 PDE entries of the new PDE page; if it is not a large page, copy the entire PDE page that the PPE entry points to into the new PDE page.

At this point the basic address mapping is complete. To modify a page, simply set the Dirty bit of the corresponding PTE (or large-page PDE, or large-page PPE) entry to 1.

ULONG_PTR AllocateSinglePXEDirectory(OUT PULONG_PTR BaseAddress, 
OUT PULONG SizeOfPPEPages)
{
    ULONG_PTR Result = 0;
    ULONG_PTR PteBase = PTEBase;
    ULONG_PTR OffsetMask = 0x7FFFFFFFF8;
    ULONG_PTR PXEVA = PteBase + (((PteBase + (((PteBase + ((PteBase >> 9) & OffsetMask)) >> 9) & OffsetMask)) >> 9) & OffsetMask) + 0x800; //PXE entry of the valid starting kernel VA 0xFFFF800000000000
    ULONG_PTR PXEVAEnd = PXEVA + 0x800; //0x800 + 0x800 == 0x1000 == PAGE_SIZE
    do
    {
        if (!(*(PULONGLONG)PXEVA & 0xFFFFFFFFF001))
        {
            PHYSICAL_ADDRESS Alloc;
            Alloc.QuadPart = MAXULONG64;
            ULONG TotalSizeOfValidPPEPages = 0x200 * PAGE_SIZE;
            while (TotalSizeOfValidPPEPages >= PAGE_SIZE &&
                !(Result = (ULONG_PTR)MmAllocateContiguousMemory(TotalSizeOfValidPPEPages + PAGE_SIZE, Alloc)))
                TotalSizeOfValidPPEPages >>= 1;
            if (Result)
            {
                if (SizeOfPPEPages) *SizeOfPPEPages = TotalSizeOfValidPPEPages;
                ULONG64 OringinalIRQL = __readcr8();
                __writecr8(DISPATCH_LEVEL);
                if (BaseAddress) *BaseAddress = ((PXEVA & 0xFF8) + 0xFFFF000) << 36;
                ULONG_PTR PTEVA = PteBase + ((Result >> 9) & OffsetMask);
                ULONG_PTR PDEVA = PteBase + ((PTEVA >> 9) & OffsetMask);
                ULONG_PTR PPEVA = PteBase + ((PDEVA >> 9) & OffsetMask);
                ULONGLONG StartValue = *(PULONGLONG)PPEVA;
                if (StartValue & 0x80)
                {
                    StartValue &= ~0x80;
                    StartValue += (Result & 0x3FFFF000);
                }
                else
                {
                    StartValue = *(PULONGLONG)PDEVA;
                    if (StartValue & 0x80)
                    {
                        StartValue &= ~0x80;
                        StartValue += (Result & 0x1FF000);
                    }
                    else StartValue = *(PULONGLONG)PTEVA;
                }
                *(PULONGLONG)PXEVA = StartValue;
                ULONG PPEOffset = 0;
                ULONG PPEOffsetEnd = TotalSizeOfValidPPEPages >> 9;
                ULONG_PTR PPEBase = Result;
                do
                {
                    *(PULONGLONG)(PPEBase + PPEOffset) = StartValue + ((PPEOffset + 8) << 9); //entry i maps physical page i + 1
                    PPEOffset += 8;
                } while (PPEOffset < PPEOffsetEnd);
                RtlZeroMemory((PVOID)(PPEBase + PPEOffset), PAGE_SIZE - PPEOffset);
                __writecr8(OringinalIRQL);
            }
            break;
        }
        PXEVA += 8;
    } while (PXEVA < PXEVAEnd);
    return Result;
}
ULONG_PTR FillPDEArrayForAllValidPPEs(ULONG_PTR PagePointer, ULONG_PTR BaseAddress, 
ULONG SizeOfPPEPages, ULONG_PTR VirtualAddress)
{
    ULONG_PTR Result = 0;
    ULONG_PTR PteBase = PTEBase;
    ULONG_PTR OffsetMask = 0x7FFFFFFFF8;
    ULONG_PTR PDEVA = PteBase + (((PteBase + (((VirtualAddress & 0xFFFFFF8000000000) >> 9) & OffsetMask)) >> 9) & OffsetMask);
    ULONG_PTR PPEVA = PteBase + ((PDEVA >> 9) & OffsetMask);
    ULONG_PTR PXEVA = PteBase + ((PPEVA >> 9) & OffsetMask);
    if ((*(PUCHAR)PXEVA & 0x81) == 1) //Does not support 512G large pages
    {
        if (SizeOfPPEPages >= PAGE_SIZE)
        {
            ULONG_PTR NewPDEVA = PagePointer + PAGE_SIZE;
            ULONG_PTR NewPDEVAEnd = NewPDEVA + SizeOfPPEPages;
            ULONG64 OringinalIRQL = __readcr8();
            __writecr8(DISPATCH_LEVEL);
            do
            {
                UCHAR ByteFlag = *(PUCHAR)PPEVA;
                if (ByteFlag & 1)
                {
                    //To avoid disturbing the previously filled PPE entries, the 1G large page is split evenly into 512 2M large pages
                    if ((CHAR)ByteFlag < 0)
                    {
                        ULONGLONG PDEStartValue = *(PULONGLONG)PPEVA;
                        ULONG PDEOffset = 0;
                        do
                        {
                            *(PULONGLONG)(NewPDEVA + PDEOffset) = PDEStartValue + (PDEOffset << 18); //8-byte offset << 18 == entry index * 0x200000 (2M per entry)
                            PDEOffset += 8;
                        } while (PDEOffset < PAGE_SIZE);
                    }
                    else memcpy((PVOID)NewPDEVA, (PVOID)PDEVA, PAGE_SIZE);
                }
                else RtlZeroMemory((PVOID)NewPDEVA, PAGE_SIZE);              
                PDEVA += PAGE_SIZE;
                PPEVA += 8;
                NewPDEVA += PAGE_SIZE;
            } while (NewPDEVA < NewPDEVAEnd);
            __writecr8(OringinalIRQL);
            __writecr3(__readcr3());
            Result = BaseAddress + (VirtualAddress & 0x7FFFFFFFFF);
        }
    }  
    return Result;
}
BOOLEAN MakeDirtyPage(ULONG_PTR VirtualAddress)
{
    BOOLEAN Result = FALSE;
    ULONG_PTR PteBase = PTEBase;
    ULONG_PTR OffsetMask = 0x7FFFFFFFF8;
    ULONG_PTR PTEVA = PteBase + ((VirtualAddress >> 9) & OffsetMask);
    ULONG_PTR PDEVA = PteBase + ((PTEVA >> 9) & OffsetMask);
    ULONG_PTR PPEVA = PteBase + ((PDEVA >> 9) & OffsetMask);
    ULONG_PTR PXEVA = PteBase + ((PPEVA >> 9) & OffsetMask);
    if ((*(PUCHAR)PXEVA & 0x81) == 1) //Does not support 512G large pages
    {
        UCHAR ByteFlag = *(PUCHAR)PPEVA;
        if (ByteFlag & 1)
        {
            if ((CHAR)ByteFlag < 0)
            {
                *(PUCHAR)PPEVA |= 0x42; //Dirty1 & Dirty
                Result = TRUE;
            }
            else
            {
                ByteFlag = *(PUCHAR)PDEVA;
                if (ByteFlag & 1)
                {
                    if ((CHAR)ByteFlag < 0)
                    {
                        *(PUCHAR)PDEVA |= 0x42;
                        Result = TRUE;
                    }
                    else
                    {
                        if (_bittest((PLONG)PTEVA, 0))
                        {
                            *(PUCHAR)PTEVA |= 0x42;
                            Result = TRUE;
                        }
                    }
                }
            }
        }
    }
    __invlpg((PVOID)VirtualAddress);
    return Result;
}
void FreeSinglePXEDirectory(ULONG_PTR PagePointer, ULONG_PTR BaseAddress)
{
    ULONG_PTR PteBase = PTEBase;
    ULONG_PTR OffsetMask = 0x7FFFFFFFF8;
    *(PULONGLONG)(PteBase + (((PteBase + (((PteBase + (((PteBase + ((BaseAddress >> 9) & OffsetMask)) >> 9) & OffsetMask)) >> 9) & OffsetMask)) >> 9) & OffsetMask)) = 0; //clear the PXE entry of BaseAddress
    MmFreeContiguousMemory((PVOID)PagePointer);
    __writecr3(__readcr3());
}

Test code and results on Win10 18362.207 x64:

NTKERNELAPI NTSTATUS ObReferenceObjectByName(IN PUNICODE_STRING ObjectName, IN ULONG Attributes,
    IN PACCESS_STATE PassedAccessState OPTIONAL, IN ACCESS_MASK DesiredAccess OPTIONAL,
    IN POBJECT_TYPE ObjectType, IN KPROCESSOR_MODE AccessMode, IN OUT PVOID ParseContext OPTIONAL,
    OUT PVOID *Object);
extern POBJECT_TYPE *IoDriverObjectType;
NTSTATUS DriverEntry(PDRIVER_OBJECT pDriverObj, PUNICODE_STRING pRegistryString)
{
    ULONG_PTR PagePointer = 0;
    ULONG_PTR BaseAddress = 0;
    ULONG SizeOfValidPPEPages = 0;
    if (GetPTEBase())
    {
        UNICODE_STRING UDrvName;
        PDRIVER_OBJECT DrvObj = 0;
        RtlInitUnicodeString(&UDrvName, L"\\Driver\\kbdclass"); //returns VOID, so it cannot be tested with NT_SUCCESS
        if (NT_SUCCESS(ObReferenceObjectByName(&UDrvName, OBJ_CASE_INSENSITIVE, 0, 0,
            *IoDriverObjectType, KernelMode, 0, (PVOID*)&DrvObj)))
        {
            ObDereferenceObject(DrvObj);
            PagePointer = AllocateSinglePXEDirectory(&BaseAddress, &SizeOfValidPPEPages);
            if (PagePointer)
            {
                ULONG_PTR MappedKbdClassBase = FillPDEArrayForAllValidPPEs(PagePointer,
                    BaseAddress, SizeOfValidPPEPages, (ULONG_PTR)DrvObj->DriverStart);
                if (MappedKbdClassBase && MakeDirtyPage(MappedKbdClassBase))
                    *(PULONG)(MappedKbdClassBase + 4) = 0x78563412;
                FreeSinglePXEDirectory(PagePointer, BaseAddress);
            }
        }
    }
    return STATUS_UNSUCCESSFUL;
}

Win10's _MMPTE_HARDWARE is also attached for reference:

typedef struct _MMPTE_HARDWARE
{
    ULONGLONG Valid : 1;
    ULONGLONG Dirty1 : 1;
    ULONGLONG Owner : 1;
    ULONGLONG WriteThrough : 1;
    ULONGLONG CacheDisable : 1;
    ULONGLONG Accessed : 1;
    ULONGLONG Dirty : 1;
    ULONGLONG LargePage : 1;
    ULONGLONG Global : 1;
    ULONGLONG CopyOnWrite : 1;
    ULONGLONG Unused : 1;
    ULONGLONG Write : 1;
    ULONGLONG PageFrameNumber : 36; //28 on Vista SP0; 36 from Vista SP1 through Win10
    ULONGLONG ReservedForHardware : 4;
    ULONGLONG ReservedForSoftware : 4;
    ULONGLONG WsleAge : 4;
    ULONGLONG WsleProtection : 3;
    ULONGLONG NoExecute : 1;
} MMPTE_HARDWARE, *PMMPTE_HARDWARE;

To sum up, the code above is roughly comparable to MmBuildMdlForNonPagedPool plus MmMapLockedPagesSpecifyCache. The difference is that this code is more like a snapshot of the physical memory layout of a 512G virtual address space: some pages may be paged into physical memory during the snapshot and paged back out to disk afterwards, so using MmIsAddressValid in the mapped address space is far less reliable than using it in the original address space.

 
