Traditional Flash wear-leveling methods are mainly based on the file system and focus on NAND Flash, while wear leveling for NOR Flash is neglected. NOR Flash devices sometimes cannot be integrated with an operating system, or the cost of doing so is too high, so wear leveling cannot be implemented through the file system. We implement Flash wear leveling in hardware to solve this problem and reduce software cost. Four modules (wear leveling, address mapping, garbage collection, and a Flash interface unit) are implemented in Verilog. When a write request arrives, the sector with the lowest erase count is found by heap sort, the virtual address is mapped to that sector's physical address, and the address mapping list is updated. When the number of garbage sectors reaches a threshold, garbage collection starts. Experimental results show that initialization, heap deletion, and read operations in the hardware wear-leveling algorithm are up to 14, 16.4, and 17.8 times faster, respectively, than in software algorithms.
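The write path described above can be illustrated with a small software model. This is a minimal Python sketch under stated assumptions, not the Verilog design: the class `WearLeveler`, its method names, the `gc_threshold` parameter, and the choice to erase every garbage sector in one pass are all hypothetical, and a binary min-heap from the standard `heapq` module stands in for the heap sort.

```python
import heapq


class WearLeveler:
    """Software model of the write path; all names here are hypothetical."""

    def __init__(self, num_sectors, gc_threshold):
        self.erase_count = [0] * num_sectors
        # Min-heap of (erase_count, sector) for sectors that are erased and free.
        self.free_heap = [(0, s) for s in range(num_sectors)]
        heapq.heapify(self.free_heap)
        self.mapping = {}   # virtual address -> physical sector
        self.garbage = []   # sectors that hold stale data
        self.gc_threshold = gc_threshold

    def write(self, virtual_addr):
        """Map virtual_addr to the free sector with the fewest erasures."""
        if not self.free_heap:
            self.collect_garbage()
        if not self.free_heap:
            raise RuntimeError("no free sectors")
        _, sector = heapq.heappop(self.free_heap)
        old = self.mapping.get(virtual_addr)
        if old is not None:
            # The previously mapped sector now holds stale data.
            self.garbage.append(old)
        self.mapping[virtual_addr] = sector
        # Assumption: collection starts once the garbage count reaches the threshold.
        if len(self.garbage) >= self.gc_threshold:
            self.collect_garbage()
        return sector

    def collect_garbage(self):
        """Erase every garbage sector and return it to the free heap."""
        for sector in self.garbage:
            self.erase_count[sector] += 1
            heapq.heappush(self.free_heap, (self.erase_count[sector], sector))
        self.garbage.clear()
```

In this model, repeatedly writing the same virtual address rotates through all physical sectors before any sector is reused, so erase counts stay balanced.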