Front. Astron. Space Sci. (Frontiers in Astronomy and Space Sciences), ISSN 2296-987X, Frontiers Media S.A.
Article 1407870, doi: 10.3389/fspas.2024.1407870
Astronomy and Space Sciences, Original Research

Design and implementation of a scalable correlator based on ROACH2 + GPU cluster for Tianlai 96-dual-polarization antenna array

Zhao Wang 1,2,3,4, Ji-Xia Li 2*, Ke Zhang 2,3, Feng-Quan Wu 2*, Hai-Jun Tian 1,5*, Chen-Hui Niu 4, Ju-Yong Zhang 5*, Zhi-Ping Chen 5*, Dong-Jin Yu 5, Xue-Lei Chen 2

1 School of Science, Hangzhou Dianzi University, Hangzhou, China
2 National Astronomical Observatories, Chinese Academy of Sciences, Beijing, China
3 College of Electrical Engineering and New Energy, China Three Gorges University, Yichang, China
4 Central China Normal University, Wuhan, China
5 Big Data Institute, Hangzhou Dianzi University, Hangzhou, China

Edited by: Hairen Wang, Purple Mountain Observatory, Chinese Academy of Sciences (CAS), China

Reviewed by: Dan Werthimer, University of California, Berkeley, United States

Tao An, Shanghai Astronomical Observatory, Chinese Academy of Sciences (CAS), China

*Correspondence: Ji-Xia Li, jxli@bao.ac.cn; Feng-Quan Wu, wufq@bao.ac.cn; Hai-Jun Tian, hjtian@hdu.edu.cn; Ju-Yong Zhang, zhangjy@hdu.edu.cn; Zhi-Ping Chen, chen_zp@hdu.edu.cn
Received: 27 03 2024; Accepted: 25 06 2024; Published: 30 07 2024. Volume 11, Article 1407870. Copyright © 2024 Wang, Li, Zhang, Wu, Tian, Niu, Zhang, Chen, Yu and Chen.

This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

The digital correlator is one of the most crucial data processing components of a radio telescope array. As radio interferometric arrays grow in scale, much effort has been devoted to developing cost-effective and scalable correlators for radio astronomy. In this paper, a 192-input digital correlator with six CASPER ROACH2 boards and seven GPU servers has been deployed as the digital signal processing system for the Tianlai cylinder pathfinder located at the Hongliuxia observatory. The correlator handles 192 input signals (96 dual-polarization feeds) over a 125-MHz bandwidth and produces full-Stokes output. It inherits the advantages of the CASPER system, for example, low cost, high performance, modular scalability, and a heterogeneous computing architecture. With a rapidly deployable ROACH2 digital sampling system, a commercially expandable 10 Gigabit switching network, and a flexibly upgradable GPU computing system, the correlator forms a low-cost and easily upgradable system, poised to support large-scale interferometric arrays in the future.

Keywords: interferometer, correlator, FPGA, signal processing, correlation, radio astronomy

Funding: National Natural Science Foundation of China-China Academy of General Technology Joint Fund for Basic Research (doi: 10.13039/501100019492)

Section at acceptance: Astronomical Instrumentation


      1 Introduction

      The digital correlator plays a crucial role in radio astronomy: it combines the signals of individual antennas to synthesize a large aperture while preserving a large field of view and providing high-resolution images. At present, many radio interferometric arrays around the world use the CASPER (Collaboration for Astronomy Signal Processing and Electronics Research) hardware platform ROACH2 (Reconfigurable Open Architecture Computing Hardware-2) to develop correlators. One example is PAPER (Precision Array for Probing the Epoch of Reionization) in South Africa’s Karoo Desert (Parsons et al., 2010; Ali et al., 2015). Its 100 MHz FX correlator was originally based on iBOBs (Interconnect Break-out Boards) and was later upgraded to ROACH and then ROACH2 boards (Hickish et al., 2016). Currently, PAPER uses 8 ROACH2 boards for channelization, followed by a GPU (Graphics Processing Unit)-based ‘X’ stage. Additionally, the ‘large-N’ correlator located at the Owens Valley Radio Observatory (LWA-OV) is designed to enable the Large Aperture Experiment to Detect the Dark Ages (LEDA) (Kocz et al., 2015). It features a 58 MHz, 512-input digitization, channelization, and packetization system with a GPU correlator backend.

      The Tianlai project 1 is an experiment aimed at detecting dark energy by measuring baryon acoustic oscillation (BAO) features in the large-scale structure power spectrum, in which BAO can be used as a standard ruler (Ansari et al., 2012; Kovetz et al., 2019; Liu and Shaw, 2020). The basic plan is to build a radio telescope array and use it to make 21 cm intensity mapping observations of neutral hydrogen, which traces the large-scale structure of the matter distribution (Chen, 2011; Xu et al., 2015; Zhang et al., 2016; Das et al., 2018; Yu et al., 2024; Yu et al., 2023). Currently, two different types of pathfinder array have been built at a radio-quiet site in Hongliuxia, Balikun county, Xinjiang, China (Wu et al., 2014). The cylinder array consists of three adjacent parabolic cylinder reflectors, each 40 m × 15 m, with their long axes oriented in the N-S direction. It has a total of 96 dual-polarization feeds, resulting in 192 signal channels (Cianciara et al., 2017; Li et al., 2020; Sun et al., 2022). The dish array includes 16 dishes with 6-m apertures. Each dish has a dual-polarization feed, generating 32 signal channels in total (Wu et al., 2021; Perdereau et al., 2022; Kwak et al., 2024). In addition to surveying the 21 cm hydrogen sky, both antenna arrays are also capable of detecting fast radio bursts (Yu et al., 2022a; Yu et al., 2022b; Yu et al., 2022c). This paper describes the development of the correlator for the Tianlai cylinder pathfinder array.

      The design of the Tianlai cylinder correlator is based on the prototype correlator of Niu et al. (2019), which has 32 inputs and was used for the Tianlai dish pathfinder array. This 32-input prototype is modeled on the PAPER correlator, which provides a flexible and scalable hybrid correlator system. The primary motivation behind the PAPER correlator architecture is scalability to large antenna arrays, and it has achieved this exceptionally well, which is why we have borrowed ideas from it. We expanded the prototype correlator from 32 to 192 channels, reprogrammed the network transport model, increased the X-engine from a single GPU server to seven GPU servers, and solved the synchronization problem among multiple devices. The system was eventually deployed in the machine room at the Hongliuxia site. The Tianlai project is expanding, and the number of single inputs will soon increase to more than 500.

      The Tianlai cylinder correlator is a flexible, scalable, and efficient system with a hybrid structure of ROACH2 + GPU + 10 GbE network. A ROACH2 is an independent board, unlike a PCIe sampling board, which must be plugged into a computer server and often leads to compatibility issues. GPU cards are improving rapidly, and among the currently available hardware options, such as CPU, GPU, and DSP (Digital Signal Processing), the GPU is close to the best choice when flexibility, efficiency, and cost are considered together. The data switch network is easy to upgrade, since Ethernet switches are widely available commercially. We have uploaded all the project files to Github. 2

      This paper gives a detailed introduction to the function and performance of the Tianlai 192-input cylinder correlator system. In Section 2, we introduce the general framework of the correlator system and show the deployment of the correlator. Then, in a sub-section, we provide a detailed introduction to the design and functions of each module. In Section 3, we evaluate the performance of the correlator. Section 4 summarizes the correlator system and presents the design scheme for correlator expansion in the future as part of the Tianlai project.

      2 System design

      Digital correlators can be classified into two types: XF and FX. XF correlators combine signals from multiple antennas and perform cross-correlation followed by Fourier transformation. XF correlators can handle a large number of frequency channels and have a relatively simple hardware design (Thompson et al., 2017). FX correlators combine signals from multiple antennas and perform the Fourier transformation followed by cross-correlation. FX correlators can handle a large number of antenna pairs and also have a relatively simple hardware design. The Tianlai cylinder correlator is an FX correlator.

      The Tianlai cylinder correlator system can be divided into four parts, as shown in Figure 1. The first part is the control part which consists of a master computer and an Ethernet switch. The Ethernet switch is used for net-booting of the ROACH2, monitoring the status of F-engine and hashpipe 3 , and synchronizing the running status of F-engine and X-engine.

      This block diagram illustrates the Tianlai cylinder array correlator. The master computer communicates with ROACH2 boards, a 10 GbE switch, and GPU servers through an Ethernet switch. Six ROACH2 boards receive 192 input signals from the antenna. After the signal processing is completed, the UDP (User Datagram Protocol) data is sent to the 10 GbE switch. Seven GPU servers receive the UDP data to calculate the cross-correlation, sending results back to the Ethernet switch after the computations are finished. Finally, the data is transmitted to the master computer for storage via the Ethernet switch.

      The second part is the F-engine, which consists of six ROACH2 boards and one 10 GbE switch. The 192 input signals from the Tianlai cylinder array are connected to the ADC connectors on the ROACH2 boards. The main functions of the F-engine are to Fourier transform the data from the time domain into the frequency domain, and transmit the data to the GPU server through a 10 GbE switch.

      The third part is the X-engine, which performs cross-correlation on the received Fourier data. Each GPU server receives packets from all ROACH2 boards. The details of network transmission will be explained later. The X-engine utilizes a software called hashpipe (MacMahon et al., 2018) to store, deliver and compute the cross-correlations.

      The fourth part is the data storage part, which consists of seven GPU servers, an Ethernet switch, and a storage server (shared with the master computer). The GPU servers transmit data to the storage server via an Ethernet switch. We have developed a multi-threading program to collect and organize data packets from different GPU servers, and finally save them onto hard drives in HDF5 format.

      The deployment of the correlator system is shown in Figure 2. It consists of six ROACH2 boards, an Ethernet switch, a 10 GbE switch, a master computer, and seven GPU servers, arranged from top to bottom. The yellow “ROACH2” label in Figure 2A marks the front panel of the ROACH2 board. (In our case, a connector transformer panel has been specifically designed to connect conveniently to the radio cables.) The ADC connector of the ROACH2 is connected to a blue RF (Radio Frequency) cable that carries the analog signal.

      Figure 2B shows the back side of the ROACH2 board. On the far left is the power line. The light orange RF cable is the clock cable. The 250 MHz clock of the ROACH2 board is generated by a VALON 5008 dual-frequency synthesizer module and distributed through a 12-way power splitter. The short blue-black cable connects the synchronization ports of the ROACH2 boards. We use the synchronization ports in the F-engine functional block design to ensure that the six ROACH2 boards run in step with one another. The signal of the synchronization port is provided by a time server, which sends out a 1-PPS (Pulse Per Second) signal used to initialize the synchronization module of the F-engine system. After the F-engine control script is run, the 1-PPS signal drives the F-engine and synchronizes the operational state of the six ROACH2 boards. The bandwidth of the antenna signal input to the ROACH2 board is 125 MHz. By the Nyquist sampling theorem, a 250 Msps sampling rate is sufficient to recover the input signal completely.

      In Figure 2C, seven GPU servers are vertically stacked: six Supermicro servers of size 4U (Unit) and one Dell server of size 2U. These devices implement the X-engine functionality. The number of servers is determined by the total frequency channel count and the number of frequency channels each server can process. In terms of computational performance, each GPU server runs 4 hashpipe threads, processing a total of 128 frequency channels. In this configuration, the achieved computational performance is approximately 46% of the theoretical peak. In terms of data transfer performance, the servers use PCIe 3.0, and the measured transfer rate is close to 8 GB/s, which is comparable to the maximum transfer rate between the host and the device.

      (A) Six ROACH2 boards are connected to 192 input signals. The two switches are positioned on the same layer, with the Ethernet switch in the front and the 10 GbE switch at the rear. The master computer is located below. (B) Clock and synchronization cable connections on the rear panel of ROACH2 boards. (C) Seven GPU servers are vertically stacked, consisting of six Supermicro servers with a size of 4U and a Dell server with a size of 2U.

      2.1 F-engine

      The diagram of the F-engine module is shown in Figure 3. The Tianlai cylinder correlator system is an improvement upon the Tianlai dish correlator system, with more input signals and additional new functions. The Tianlai dish correlator is very similar to the PAPER experiment correlator, which also uses the ROACH2 system; see Niu et al. (2019) for details. Here, we provide a concise overview of the F-engine’s data flow and the functionality of the CASPER yellow blocks.

      Data flow block diagram of each F-engine.

      Each ROACH2 board is connected to two ADC boards through Z-DOK+ connectors. The ADC board is the adc16x250-8 coax rev2 Q2 2013 version, which uses 4 HMCAD1511 chips and provides a total of 16 inputs. It samples 16 analog signal inputs with 8 bits at a rate of 250 Msps.

      The output digital signal of the adc16x250 block is in Fix_8_7 format, that is, an 8-bit number with 7 bits after the binary point. The ADC chip is accompanied by a control program developed by David MacMahon from the University of California, Berkeley. This program is responsible for activating the ADC, selecting the amplification level, calibrating the FPGA input delay, aligning the FPGA SERDES blocks until data is correctly framed, and performing other related tasks. A comprehensive user’s guide for the ADC16 chip is accessible on the CASPER website 4 . Based on the actual range of input signal power, we conducted linearity tests on the ADC at various gain coefficients and also assessed the linearity of the correlator system. Ultimately, the ADC gain coefficient was set to 2.

      The analog-to-digital converted data from the ADC is transmitted to the PFB (Polyphase Filter Bank) function module. PFB is a computationally efficient implementation of a filter bank, constructed by using an FFT (Fast Fourier Transform) preceded by a prototype polyphase FIR filter frontend (Price, 2021). The PFB not only ensures a relatively flat response across the channels but also provides excellent suppression of out-of-band signals. The PFB is implemented using the models pfb_fir and fft_biplex_real_2x from the CASPER module library.

      Each pfb_fir 5 block (the signal processing blocks mentioned in this article can all be linked to the detailed page from here) processes two signals and is configured with a PFB size of $2^{11}$, a Hamming window function, four taps, an input width of 8 bits, an output width of 18 bits, and other settings. A total of 16 pfb_fir blocks are used to process the 32 input signals of one ROACH2 board. Each fft_biplex_real_2x block takes four input data streams and outputs two sets of frequency-domain data; it is configured with an FFT size of $2^{11}$, an input width of 18 bits, an output width of 36 bits, and other settings. Eight fft_biplex_real_2x blocks are used. The parameter settings are based on the scientific requirements of the Tianlai project, which call for a frequency resolution of 0.2 MHz or finer. The PFB module is flexible, making it easy to adjust parameters such as the FFT size, PFB size, and number of taps in the CASPER block according to one’s requirements.
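      As an illustration of the PFB structure described above, the following is a minimal floating-point sketch in Python, assuming NumPy is available. It uses the quoted parameters (size $2^{11}$, four taps, Hamming window), but the function name `pfb_spectra`, the windowed-sinc prototype filter, and the dropping of the Nyquist bin are our assumptions; it does not reproduce the fixed-point arithmetic or exact coefficients of the CASPER pfb_fir and fft_biplex_real_2x blocks.

```python
import numpy as np

def pfb_spectra(x, n_fft=2048, n_taps=4):
    """Critically sampled PFB: windowed-sinc FIR front end followed by a real FFT."""
    # Prototype filter (assumed form): a sinc one channel wide, shaped by a Hamming window.
    n_coef = n_fft * n_taps
    t = (np.arange(n_coef) - (n_coef - 1) / 2) / n_fft
    coeffs = (np.sinc(t) * np.hamming(n_coef)).reshape(n_taps, n_fft)

    # Split the time stream into consecutive blocks of n_fft samples.
    x = np.asarray(x, dtype=float)
    n_blocks = len(x) // n_fft
    blocks = x[: n_blocks * n_fft].reshape(n_blocks, n_fft)

    spectra = []
    for i in range(n_blocks - n_taps + 1):
        # Weight n_taps consecutive blocks and sum them into one n_fft-long frame.
        frame = (blocks[i : i + n_taps] * coeffs).sum(axis=0)
        # The real FFT yields n_fft/2 + 1 bins; keep the first n_fft/2 as channels.
        spectra.append(np.fft.rfft(frame)[: n_fft // 2])
    return np.array(spectra)

# Usage: a test tone centred on channel 300 of 1,024.
n = np.arange(8 * 2048)
tone = np.cos(2 * np.pi * 300 * n / 2048)
spectra = pfb_spectra(tone)
peak_channels = np.argmax(np.abs(spectra), axis=1)
```

      With a 2,048-point real FFT the sketch yields 1,024 frequency channels per spectrum, matching the channel count used in the rest of the system.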

      The data output of the PFB module is 36 bits wide, representing a complex number with 18 bits for the real part and 18 bits for the imaginary part. Because of data transmission and hardware resource constraints, the data is usually truncated. In our case, we truncate the complex number to a 4-bit real part and a 4-bit imaginary part. Prior to quantizing to 4 bits, the PFB output values pass through a scaling (i.e., gain) stage. Each frequency channel of each input has its own scaling factor. The purpose of the scaling stage is to equalize the passband before quantization, so this stage is often referred to as EQ. The scaling factors are also known as EQ 6 coefficients and are stored in shared BRAMs.
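      The scaling and 4-bit quantization step can be sketched as follows, assuming NumPy. The function name `equalize_and_quantize`, the round-to-nearest rule, and the symmetric saturation range of -7 to 7 are assumptions made for illustration; the rounding and saturation behavior of the deployed FPGA block is not specified in the text.

```python
import numpy as np

def equalize_and_quantize(spectrum, eq_coeffs):
    """Scale each channel by its EQ coefficient, then reduce to 4-bit real/imag parts."""
    scaled = np.asarray(spectrum) * np.asarray(eq_coeffs)
    # Assumption: round to nearest and saturate to the symmetric range [-7, 7].
    re = np.clip(np.rint(scaled.real), -7, 7).astype(np.int8)
    im = np.clip(np.rint(scaled.imag), -7, 7).astype(np.int8)
    return re, im

# Usage: three channels with per-channel EQ coefficients.
re4, im4 = equalize_and_quantize(
    np.array([100 + 40j, -1000 + 30j, 2 - 2j]),
    np.array([0.1, 0.01, 1.0]),
)
```

      Choosing the EQ coefficients so that typical channel amplitudes sit well inside the 4-bit range keeps both saturation and quantization noise under control.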

      The quantized data cannot be sent directly to the X-engine. Before sending it, we divide the frequency band and reorder the data into a format that facilitates the correlation calculations. This module is called Transpose, and it is divided into four submodules. Each submodule processes one quarter of the frequency band, i.e., 256 frequency channels. The number of submodules corresponds to the number of 10 GbE network interface controllers (NICs) on the ROACH2 board, with each NIC sending the output of one transpose submodule. The module performs the data transpose, also known as a “corner turn”, to arrange the data in the desired sequence. It is also responsible for generating the packet headers, which consist of MCNT (master counter), Fid (F-engine id), and Xid (X-engine id). The current parameter configuration of the submodule is tailored for scenarios with 256 inputs or fewer. However, David MacMahon, the researcher behind the PAPER correlator system, included sufficient spare bits in the design, so the model parameters can be adjusted to accommodate larger input counts and additional use cases.

      At this point the data is in a form that is easy for the X-engine to compute on, so it is passed to the Ethernet module for transmission. This module contains four submodules, each receiving data from one of the four transpose submodules. Each submodule has a Ten_GbE_v2 block, whose MAC address, IP address, destination port, and other parameters can be set using a Python or Ruby script.

      2.2 Network

      The data of the F-engine module is sent out through the ROACH2 network port and transmitted to the network port of the target GPU server through the 10 GbE switch. The network transmission model of the correlator system is dependent on the bandwidth of a single frequency channel and the number of frequency channels calculated by the GPU server. The diagram of data transfer from F-engine to X-engine is shown in Figure 4.

      Diagram of data transmission between X-engine and F-engine. Each ROACH2 board has four 10 GbE ports, and each port transmits data from 256 frequency channels. The 10 GbE switch is configured with 4 VLANs (Virtual Local Area Networks), which offers benefits in simplicity, security, traffic management, and economy. Each VLAN receives UDP data from 256 frequency channels and sends them to the GPU servers.

      The frequency domain data in the F-engine has a total of 1,024 frequency channels. Given the 250 Msps sampling rate, each frequency channel has a width of Δν = 125/1,024 MHz ≈ 122.07 kHz. Each GPU node processes 128 frequency channels with a bandwidth of 128 × 122.07 kHz = 15.625 MHz. The number of frequency channels processed by the GPU server is determined by hashpipe.

      The analog part of the Tianlai digital signal processing system uses replaceable bandpass filters, with the passband set to 700 MHz to 800 MHz. We have chosen to use seven GPU nodes to implement the X-engine component. These seven GPU nodes process data for the central 896 frequency channels, covering a bandwidth of 109.375 MHz from 692.8125 MHz to 802.1875 MHz, as shown in Figure 5. The final GPU node is dedicated to receiving data from the first 32 and last 32 frequency channels out of the 896 frequency channels.

      The relationship between the FFT channels and the radio frequency. There are a total of 1,024 frequency channels, and the input signal’s effective frequency range is 700-800 MHz. The correlator’s actual processing frequency range is 692.8125-802.1875 MHz, which includes a total of 896 frequency channels.
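      The channel arithmetic quoted above can be checked with a few lines of Python. Note that the 685 MHz lower edge of the full 1,024-channel band is not stated in the text; it is an assumption inferred so that the central 896 channels reproduce the quoted 692.8125 MHz to 802.1875 MHz range.

```python
N_CHANNELS = 1024
BANDWIDTH_MHZ = 125.0        # half of the 250 Msps sampling rate
CHANNELS_PER_NODE = 128
N_NODES = 7

# Width of one frequency channel and bandwidth handled by one GPU node.
channel_width_mhz = BANDWIDTH_MHZ / N_CHANNELS
node_bandwidth_mhz = CHANNELS_PER_NODE * channel_width_mhz

# Channels and bandwidth actually processed by the seven nodes.
processed_channels = N_NODES * CHANNELS_PER_NODE
processed_bandwidth_mhz = processed_channels * channel_width_mhz

# Assumption: the 1,024 channels start at 685 MHz and the processed block is centred.
BAND_START_MHZ = 685.0
skipped_per_edge = (N_CHANNELS - processed_channels) // 2
low_edge_mhz = BAND_START_MHZ + skipped_per_edge * channel_width_mhz
high_edge_mhz = low_edge_mhz + processed_bandwidth_mhz

print(f"channel width: {channel_width_mhz * 1e3:.2f} kHz")
print(f"per-node bandwidth: {node_bandwidth_mhz} MHz")
print(f"processed band: {low_edge_mhz} to {high_edge_mhz} MHz")
```

      Under this assumption, 64 channels are left unprocessed at each edge of the band.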

      The data transfer rate of a single network port of the ROACH2 board is 8.0152 Gbps, so the total data transfer rate of 6 ROACH2 boards is 6 × 4 ports × 8.0152 Gbps = 192.3648 Gbps. The data reception rate of a single network port on the GPU node is 192.3648 Gbps / (8 × 4) = 6.0114 Gbps. At present, the correlator system supports between 32 and 256 input signals. While our correlator system is designed for 192 input signals, we conducted data transmission simulations with 256 input signals. Under these conditions, the data reception rate of a single network port on the GPU node is 8.0152 Gbps.
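      The rates above follow from simple bookkeeping, sketched below. Reading the divisor 8 × 4 as eight frequency groups (1,024 channels / 128 channels per node) times four ports per node is our interpretation, and the function name `gpu_port_rate_gbps` is hypothetical.

```python
PORT_RATE_GBPS = 8.0152      # per ROACH2 10 GbE port, as quoted in the text
PORTS_PER_BOARD = 4
PORTS_PER_NODE = 4
FREQ_GROUPS = 8              # assumption: 1,024 channels / 128 channels per node
INPUTS_PER_BOARD = 32

def gpu_port_rate_gbps(n_inputs):
    """Return (total F-engine output rate, rate received per GPU-node port) in Gbps."""
    n_boards = n_inputs // INPUTS_PER_BOARD
    total = n_boards * PORTS_PER_BOARD * PORT_RATE_GBPS
    return total, total / (FREQ_GROUPS * PORTS_PER_NODE)

total_192, per_port_192 = gpu_port_rate_gbps(192)   # deployed configuration
total_256, per_port_256 = gpu_port_rate_gbps(256)   # simulated configuration
```

      With 256 inputs the per-port receive rate equals the per-port transmit rate, which is why that case marks the upper end of the current parameter configuration.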

      In our system, each GPU server has four 10 GbE ports. For the Tianlai cylinder correlator system, we require a total of 6 ROACH2 boards × 4 ports + 7 GPU servers × 4 ports = 52 ports on a 10 GbE switch. We therefore selected the Mellanox SX1024 switch, which has 48 ports of 10 GbE and 12 ports of 40 GbE. Ports 59 and 60 on the switch can each be subdivided into four 10 GbE ports, providing ample capacity for our application.

      The transpose module is designed with extra bits reserved in the blocks related to the parameter fid. The number of bits in the fid parameter is directly linked to the maximum number of F-engines in the correlator system. By using these additional bits, the correlator can be configured to accommodate a greater number of input signals. On the F-engine side, there is in principle no limit on the number of input channels, since the number of ROACH2 boards can be increased to match. The upper limit is instead set by the X-engine: each frequency channel must contain the data of all input channels, so the processing capacity of a GPU server for a single frequency channel determines the maximum number of input channels. Currently, a single server is theoretically capable of handling over 20,000 input channels if it processes only one frequency channel. However, this may require an extremely large-scale switch network.

      The relationship between the number of input channels and the output data rate is given by Eq. 1:

      $$\text{Output data rate} = \frac{1}{2} \times N(N+1) \times f_{ch} \times 2 \times f_{b} \,/\, \mathrm{Integration\_time}, \tag{1}$$

      where $N$ represents the number of input channels, $f_{ch}$ represents the number of frequency channels, and $f_{b}$ represents the number of bytes in a single frequency channel. The factor of 2 arises because the frequency-channel values are complex numbers.
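      Eq. 1 translates directly into a small helper. The function and argument names below are ours, and the example values (4 bytes per value, 4 s integration) are purely illustrative rather than the deployed configuration.

```python
def output_rate_bytes_per_s(n_inputs, n_freq_channels, bytes_per_value, integration_time_s):
    """Output data rate of Eq. 1 in bytes per second."""
    # N(N+1)/2 correlation products: all cross-correlations plus the auto-correlations.
    n_products = n_inputs * (n_inputs + 1) // 2
    # Factor 2: each product is complex (real and imaginary parts).
    return n_products * n_freq_channels * 2 * bytes_per_value / integration_time_s

# Illustrative values only: 192 inputs, 896 channels, 4-byte values, 4 s integration.
n_products_192 = 192 * 193 // 2
rate = output_rate_bytes_per_s(192, 896, 4, 4.0)
```

      Because the number of products grows as $N^2$, doubling the number of inputs roughly quadruples the output data rate at fixed integration time.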

      2.3 X-engine

      The primary role of the X-engine is to perform cross-correlation calculations. The X-engine receives the data from the F-engine in packets, which are then delivered to different computing servers, where the conjugate multiplication and accumulation (CMAC) are done. The hardware for this part consists mainly of six Supermicro servers and one Dell server. We list the main equipment of the X-engine in Table 1.

      List of X-engine equipment.

      Item            Supermicro (4U)       Dell (2U)
      PCIe            3.0                   4.0
      Graphics card   Dual GTX 690          One RTX 3080
      CPU             Dual Intel E5-2670    Dual Intel E5-2699
      NIC             Dual 2-port 10 GbE    Dual 2-port 10 GbE
      Memory          128 GB RAM            256 GB RAM
      OS              CentOS 7              Rocky Linux 8

      The X-engine part consists of seven GPU nodes. To ensure that they integrate the data over exactly the same time interval, they must be synchronized. A script has been developed to achieve this, and its basic procedure is as follows. First, initialize the hashpipe instances on the 7 GPU nodes. Second, start the hashpipe program of the first GPU node. Third, read out the MCNT value of the current packet and calculate a future MCNT value (several seconds later) to act as the aligning time point. Finally, all GPU nodes start working simultaneously when their hashpipe threads receive a packet that contains the calculated aligning MCNT value.
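      The alignment step of this procedure can be sketched as follows. The MCNT rate, the lead time, and the rounding to a block boundary are hypothetical parameters introduced for illustration; the text does not give the values used by the deployed script.

```python
import math

def aligning_mcnt(current_mcnt, lead_time_s, mcnt_per_second, block_size):
    """Pick a future MCNT, rounded up to a block boundary, at which all nodes start."""
    target = current_mcnt + lead_time_s * mcnt_per_second
    return math.ceil(target / block_size) * block_size

def should_start(packet_mcnt, start_mcnt):
    """A node begins integrating once it sees a packet at or past the aligning MCNT."""
    return packet_mcnt >= start_mcnt

# Hypothetical numbers: 100 MCNT ticks per second, 5 s lead time, blocks of 64 ticks.
start = aligning_mcnt(current_mcnt=1000, lead_time_s=5, mcnt_per_second=100, block_size=64)
```

      Because every node compares against the same MCNT value carried in the F-engine packets, the start time does not depend on the clocks of the individual servers.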

      The data operation in the X-engine is managed by the hashpipe software running on CPU and GPU heterogeneous servers. Hashpipe was originally developed as an efficient shared pipe engine for the National Astronomical Observatory, the Universal Green Bank Astrospectrograph (Prestage et al., 2009). It was later adapted by David MacMahon of U.C. Berkeley, and it can be used for FX correlators (Parsons et al., 2008), pulsar observations (Pei et al., 2021), fast radio burst detection (Yu et al., 2022a), and the search for extraterrestrial civilizations (Price et al., 2018). The core of hashpipe is a flexible ring buffer. It simulates contiguous memory blocks, implements data transmission and sharing among multiple threads, and uses the central processing unit to control startup, shutdown, and similar operations. The ring buffer temporarily stores and delivers the data packets, ensuring that the data is captured quickly and distributed in the correct order.

      Each hashpipe instance in our system has a total of four threads and three buffers, as shown in Figure 6. To process the data streams of the four 10 GbE ports, four hashpipe instances are created. In each instance, the basic data processing can be summarized as follows. First, net_thread receives the packets from the GPU server’s 10 GbE port. According to the packet format, the valid data is extracted and the packet header is analyzed. Packets are time-stamped, so if they arrive at the GPU server out of order, they can be rearranged into the appropriate time series and written to the input data buffer, which is passed on to the next thread once a consecutive block of data is filled. The fluff thread then “fluffs” the data, expanding the 4-bit + 4-bit complex data into 8-bit + 8-bit complex data. The fluffed data is temporarily stored in the GPU input data buffer until it is fetched by gpu_thread. Then gpu_thread transfers the data to the graphics processor to perform the complex calculations and writes the results to the output data buffer. Finally, the output thread takes the data from the output data buffer and transmits it to the storage server through the switch.

      The CMAC process uses xGPU 7 (Clark et al., 2013), which is written in CUDA-C and is optimized for GPU memory resources through specific thread tasks. The cross-correlation algorithm computes the cross-power spectrum at a specific frequency observed by a pair of stations, known as a baseline. By processing a sufficient number of baselines, a detailed power spectrum representation can be derived, enabling the generation of an image of the sky through an inverse Fourier transform in the spatial domain. The algorithm’s implementation on Nvidia’s Fermi architecture sustains high performance by using a software-managed cache, a multi-level tiling strategy, and efficient data streaming over the PCIe bus, showing significant advances over previous GPU implementations.

      Hashpipe provides a status buffer that holds key-value pairs for each thread. These values are updated every running cycle. The status can be viewed using a GUI monitor that has been written in both Python and Ruby.
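      The fluff and CMAC stages can be illustrated with a small NumPy sketch. The nibble layout (real part in the high four bits, imaginary part in the low four bits, two's complement) is an assumption, the function names are ours, and the dense `einsum` below is a reference calculation of the full visibility matrix rather than the tiled, triangular CUDA implementation used by xGPU.

```python
import numpy as np

def fluff(packed):
    """Expand bytes holding 4-bit real + 4-bit imaginary parts into two int8 arrays."""
    packed = np.asarray(packed, dtype=np.uint8)
    # Assumption: real part in the high nibble, imaginary part in the low nibble.
    re = (packed >> 4).astype(np.int8)
    im = (packed & 0x0F).astype(np.int8)
    # Interpret each nibble as a two's-complement value in [-8, 7].
    re[re > 7] -= 16
    im[im > 7] -= 16
    return re, im

def cmac(voltages):
    """Conjugate-multiply and accumulate over time for every input pair and channel.

    voltages: complex array of shape (n_time, n_chan, n_input).
    Returns visibilities V[c, i, j] = sum_t conj(v[t, c, i]) * v[t, c, j].
    """
    return np.einsum("tci,tcj->cij", np.conj(voltages), voltages)

# Usage: unpack three bytes, then correlate two inputs over two time samples.
re, im = fluff([0x7F, 0x81, 0x00])
v = np.array([[[1 + 1j, 2 + 0j]], [[0 + 1j, 1 + 0j]]])   # shape (2, 1, 2)
vis = cmac(v)
```

      The result is Hermitian, so only one triangle of the matrix (including the diagonal of auto-correlations) needs to be computed and stored.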

      Hashpipe thread manager diagram.

      2.4 Data storage

      At the beginning of the design, two schemes for data storage were considered. One is that the data is stored on each GPU server, and it is read and combined when used. Due to the large number of GPU servers, this method is too cumbersome. The other is that the data is transmitted from each GPU server to the master computer in real-time, and the data is stored in the master computer. This method is convenient for data use and processing, so the second scheme is adopted.

      Each GPU node has 4 hashpipe instances, and the output thread of each hashpipe instance sends data to a dedicated destination port. A total of 28 different UDP ports are used for the 7 GPU servers. The data acquisition script, written in Python, collects data from all 28 UDP ports and combines them. Currently, the integration time is set to approximately 4 s, resulting in a data rate of about 150 Mbps for each network port. The total data rate for all seven servers with 28 ports amounts to approximately 4.2 Gbps. Therefore, a 10 GbE network is capable of handling the data transmission. Finally, the data are saved onto hard drives in the HDF5 format. Additional information such as integration time, observation time, telescope details, and observer information is also automatically saved in the file.

      2.5 CNS control module

      During the drift scan observation of the Tianlai cylinder array, the system needs to be calibrated by a calibrator noise source (CNS). The CNS periodically broadcasts a broadband white noise of stable magnitude from a fixed position, so the system gain can be recovered (Zuo et al., 2021; Zuo et al., 2019). One requirement in the data processing part is to let the CNS’s signal fall exactly in one integration time interval, so it needs to be aligned to the integration time. To achieve this, a logical ON/OFF signal from the cylinder correlator is necessary. In order to meet this requirement, we have introduced a noise source control function to the correlator system. This control function is implemented through the noise_source_control block in the F-engine, as shown in Figure 7A.

      (A) Design of the calibrator noise source control module. (B) The periodic CNS signal in the cross-correlation results is aligned with the integration time.

      The control script proceeds as follows. First, it enables the counter_en block to initialize the module. Second, the hashpipe instance on the GPU node returns the MCNT value of its current packet. The script uses this value to calculate the CNS MCNT value (an MCNT value at a future time; when the MCNT value in the F-engine equals this value, the CNS is turned ON) and writes it to the reg31_0 and reg47_32 blocks. Third, the CNS on/off period is converted to the corresponding change in MCNT, and the period_mcnt block is set to this value. Fourth, the working time of the GPIO on the ROACH2 board is written to the light block. Finally, the GPIO periodically sends out a logical signal to turn the CNS on or off.
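      The logic of these steps can be modeled in a few lines. Splitting the CNS MCNT into a low 32-bit word and a high 16-bit word is inferred from the register names reg31_0 and reg47_32; the function names and the on-duration parameter `on_mcnt` are hypothetical, and the sketch models the intended behavior rather than the FPGA design itself.

```python
def split_mcnt(mcnt):
    """Split a 48-bit MCNT into (bits 31..0, bits 47..32) for two registers."""
    return mcnt & 0xFFFFFFFF, (mcnt >> 32) & 0xFFFF

def cns_is_on(mcnt, cns_start_mcnt, period_mcnt, on_mcnt):
    """Model the periodic ON/OFF signal: ON for on_mcnt ticks at the start of each period."""
    if mcnt < cns_start_mcnt:
        return False
    return (mcnt - cns_start_mcnt) % period_mcnt < on_mcnt

# Usage with hypothetical values.
low, high = split_mcnt(0x123456789ABC)
states = [cns_is_on(m, cns_start_mcnt=100, period_mcnt=10, on_mcnt=3) for m in (99, 100, 102, 103, 110)]
```

      Expressing both the start time and the period in MCNT units is what allows the CNS signal to be aligned with the integration intervals of the X-engine.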

      We tested the accuracy of the CNS control module and its actual output result, as shown in Figure 7B. The CNS is activated based on a pre-set MCNT value and is aligned precisely with the integration time interval.

      3 Testing and experimentation

      3.1 ADC testing

      The quality and performance of the ADCs directly affect the overall functionality of the correlator. To verify the sampling correctness of the ADC, we input a 15.625 MHz sinusoidal signal into the ADC and fit the digitized data. The sampling points and fitting result are shown in Figure 8A. The correlator system requires the ADC output to respond linearly at different signal levels. We plot the logarithm of the standard deviation of the ADC output, for three different gain coefficients, as a function of input power level; the results are shown in Figure 8B. No obvious nonlinearity is found in the tested power range.

      (A) The ADC output data points and the fitting curve. (B) The logarithm of the standard deviation of the ADC output, for gain coefficients 1, 2, and 4, at different input power levels.
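      A sine fit of the kind described for Figure 8A can be sketched as below. This is not the authors' fitting code. It assumes a 250 Msps sampling rate (suggested by the ADC16x250-8 board name, not stated in this section), uses synthetic integer samples, and exploits the fact that sine and cosine are orthogonal over a whole number of periods, so the least-squares solution at a known frequency reduces to two projections.

```python
import math

F_SIGNAL_HZ = 15.625e6   # test tone from the text
F_SAMPLE_HZ = 250e6      # assumed sampling rate (not stated in this section)


def fit_sine(samples, f_signal_hz, f_sample_hz):
    """Fit A*sin(w*k + phi) + offset at a known frequency.

    Valid when the record spans a whole number of signal periods and there
    are more than two samples per period; then the least-squares solution
    is given by projections onto sin and cos.
    Returns (amplitude, phase, offset).
    """
    n = len(samples)
    cycles = n * f_signal_hz / f_sample_hz
    if round(cycles) < 1 or abs(cycles - round(cycles)) > 1e-9:
        raise ValueError("record must span a whole number of signal periods")
    w = 2.0 * math.pi * f_signal_hz / f_sample_hz  # phase step per sample
    offset = sum(samples) / n
    a = 2.0 / n * sum(x * math.sin(w * k) for k, x in enumerate(samples))
    b = 2.0 / n * sum(x * math.cos(w * k) for k, x in enumerate(samples))
    return math.hypot(a, b), math.atan2(b, a), offset


# Synthetic "ADC" record: 64 integer samples = 4 periods at 16 samples/period.
TRUE_AMPLITUDE, TRUE_PHASE = 50.0, 0.3
w = 2.0 * math.pi * F_SIGNAL_HZ / F_SAMPLE_HZ
samples = [round(TRUE_AMPLITUDE * math.sin(w * k + TRUE_PHASE)) for k in range(64)]

amplitude, phase, offset = fit_sine(samples, F_SIGNAL_HZ, F_SAMPLE_HZ)
residuals = [x - (amplitude * math.sin(w * k + phase) + offset)
             for k, x in enumerate(samples)]
rms = math.sqrt(sum(r * r for r in residuals) / len(residuals))
print(f"amplitude={amplitude:.2f}, phase={phase:.3f} rad, residual rms={rms:.2f}")
```

      For a correctly sampling ADC the residuals should be comparable to the quantization step; large or structured residuals would point to a sampling fault.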

      3.2 Phase testing

      We verify the phase of the visibility (cross-correlation result) using two input signals whose phase difference is determined by a cable length difference. A noise source generator outputs a white noise signal, which a power splitter divides into two paths. The two signals are then fed into the ROACH2 board through two RF cables of different lengths; the cable length difference is 15 m. The two signals can be written as $S_1 = A_1 e^{i(2\pi f t + \phi_0)}$ and $S_2 = A_2 e^{i(2\pi f (t + \tau) + \phi_0)}$, where $A_1$ and $A_2$ are the wave amplitudes, $\phi_0$ is an arbitrary initial phase, $f$ is the frequency, and $\tau$ is the delay incurred by the unequal-length RF cables. The visibility of the two signals is

      $$V = \langle S_1^{*} S_2 \rangle = A_1 A_2 e^{i 2\pi f \tau}. \tag{2}$$

      The cable length difference of the two input signals is fixed, so the delay $\tau$ is constant over time. As Eq. 2 shows, the phase $\Phi = 2\pi\tau f$ is a linear function of frequency with slope $k = 2\pi\tau$. The delay is $\tau = \Delta l / \tilde{c}$, where $\Delta l$ is the cable length difference and $\tilde{c}$ is the propagation speed of the RF signal in the coaxial cable.

      The measured phase of the visibility output by our correlator in this experiment is shown as a 2D waterfall plot in Figure 9A; a 1D plot of phase as a function of frequency at one integration time is shown in Figure 9B. From the slope of the curve in Figure 9B, we obtain a propagation speed in the coaxial cable of about 0.78 c (0.78 times the speed of light in vacuum), which is consistent with the specification of the RF cable.

      Phase correctness check of the correlator by two signals with fixed phase difference. The measured phase is consistent with the length difference of the two cables. (A) Phase fringes of the two input signals. (B) The relationship between frequency and phase.
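      The step from the phase slope to the propagation speed can be sketched as follows. The code uses synthetic, noise-free phases generated from the relation between phase, delay, and frequency given above; it is not the measured data, and the helper names are hypothetical. Measured phases are wrapped into a 2π interval, so they are unwrapped before the linear fit.

```python
import math

C_VACUUM = 299_792_458.0  # speed of light in vacuum, m/s


def unwrap(phases):
    """Remove 2*pi jumps so that consecutive samples differ by less than pi."""
    out = [phases[0]]
    for p in phases[1:]:
        d = p - out[-1]
        d -= 2.0 * math.pi * round(d / (2.0 * math.pi))
        out.append(out[-1] + d)
    return out


def ls_slope(x, y):
    """Least-squares slope of y against x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    num = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    den = sum((xi - mx) ** 2 for xi in x)
    return num / den


def velocity_factor(freqs_hz, wrapped_phases_rad, cable_diff_m):
    """Propagation speed in the cable as a fraction of c, from the phase slope."""
    k = ls_slope(freqs_hz, unwrap(wrapped_phases_rad))  # rad/Hz, equals 2*pi*tau
    tau = k / (2.0 * math.pi)                           # delay in seconds
    return cable_diff_m / tau / C_VACUUM


# Synthetic example: 15 m cable difference, velocity factor 0.78.
CABLE_DIFF_M, TRUE_VF = 15.0, 0.78
tau_true = CABLE_DIFF_M / (TRUE_VF * C_VACUUM)
freqs = [700e6 + i * 1e6 for i in range(101)]
wrapped = [math.atan2(math.sin(2 * math.pi * tau_true * f),
                      math.cos(2 * math.pi * tau_true * f)) for f in freqs]

vf = velocity_factor(freqs, wrapped, CABLE_DIFF_M)
print(f"delay = {tau_true * 1e9:.1f} ns, recovered velocity factor = {vf:.3f}")
```

      The unwrapping step only works if the frequency spacing is fine enough that the phase advances by less than π between adjacent channels, which holds for the 1 MHz spacing assumed here.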

      3.3 Linearity of correlator system

      The linearity of our correlator is verified by comparing the input power levels with the output amplitudes. The results are shown in Figure 10. Considering multiple factors, we have set the ADC gain coefficient for the correlator to 2. We conclude that the linear dynamic range of our correlator extends from −22 dBm to 0 dBm within the 125 MHz bandpass. In realistic observations, the power levels output by the receivers vary by at most 10 dB, so the 22 dB dynamic range of our correlator satisfies our observation requirements.

      Linearity of correlator system. The ADC gain coefficients of 1, 2, and 4 were used for system linearity testing. A gain coefficient of 2 was selected as the daily operational parameter value for the correlator.
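      The dynamic-range figures quoted above follow from simple decibel arithmetic, sketched here for reference. The constants come from the text; the variable names are illustrative only.

```python
# Decibel arithmetic for the quoted linear range.

LOWER_DBM, UPPER_DBM = -22.0, 0.0   # linear range limits from the text
RECEIVER_VARIATION_DB = 10.0        # maximum variation of receiver output power


def dbm_to_mw(p_dbm):
    """Convert a power level in dBm to milliwatts."""
    return 10.0 ** (p_dbm / 10.0)


dynamic_range_db = UPPER_DBM - LOWER_DBM
headroom_db = dynamic_range_db - RECEIVER_VARIATION_DB
power_ratio = dbm_to_mw(UPPER_DBM) / dbm_to_mw(LOWER_DBM)

print(f"dynamic range = {dynamic_range_db:.0f} dB "
      f"(power ratio ~{power_ratio:.0f}), headroom = {headroom_db:.0f} dB")
```

      A 22 dB range corresponds to a power ratio of roughly 160, leaving about 12 dB of margin over the expected 10 dB variation.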

      3.4 Sky observation

      The full frequency band of each feed, ranging from 692.8125 to 802.1875 MHz, is divided into 28 sub-bands. These sub-bands are sent to different hashpipe instances for the correlation calculation, and the final spectra are the combination of the 28 sub-bands. Spectra of several feeds (A10X, A19Y, B27X, and C12X) are plotted in Figure 11. The Tianlai cylinder array is aligned in the N-S direction and consists of three adjacent cylinders, designated A, B, and C from east to west, with 31, 32, and 33 feeds respectively. Each dual linear polarization feed generates two signal outputs. We use “X” to denote the output for the polarization along the N-S direction and “Y” for the polarization along the E-W direction. The spectra of the selected feeds in Figure 11 come from all three cylinders and are smooth across adjacent frequency sub-bands. No obvious amplitude inconsistency between sub-bands is found.

      Spectral response of feed A1Y, A10X, B3Y, and C12X.
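      The band division and input count described above can be tabulated with a short sketch. It assumes the 28 sub-bands have equal width, which the text does not state explicitly; the variable names are illustrative.

```python
# Sub-band layout and input count (equal-width sub-bands are an assumption).

F_START_MHZ = 692.8125
F_STOP_MHZ = 802.1875
N_SUBBANDS = 28
FEEDS_PER_CYLINDER = {"A": 31, "B": 32, "C": 33}
POLARIZATIONS = ("X", "Y")  # X: N-S direction, Y: E-W direction

subband_width_mhz = (F_STOP_MHZ - F_START_MHZ) / N_SUBBANDS
subbands = [
    (F_START_MHZ + i * subband_width_mhz,
     F_START_MHZ + (i + 1) * subband_width_mhz)
    for i in range(N_SUBBANDS)
]

# Input labels such as "A10X": cylinder letter, feed number, polarization.
inputs = [
    f"{cyl}{feed}{pol}"
    for cyl, n_feeds in FEEDS_PER_CYLINDER.items()
    for feed in range(1, n_feeds + 1)
    for pol in POLARIZATIONS
]

print(f"sub-band width = {subband_width_mhz} MHz, "
      f"first = {subbands[0]}, last = {subbands[-1]}")
print(f"{len(inputs)} inputs, e.g. {inputs[:4]}")
```

      With 28 sub-bands and 28 hashpipe instances (four on each of seven servers), each instance would handle one sub-band under this assumption.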

      In these spectra, a periodic ripple with a period of about 6.8 MHz can be seen. It has been confirmed to result from standing waves in the 15-m feed cable (Li et al., 2021).

      We carried out 4.4 h (16,000 s) of continuous observation starting on the night of August 7th, 2023; the data are shown in Figure 12. The fringe of the bright radio source Cassiopeia A appears at around the 8000th second.

      Observed phase of Cassiopeia A at around the 8000th second.

      The correlator's ability to operate continuously was also tested: it ran for a month without a fault. We plot 4 days of continuous observation data for three baselines as a function of LST (local sidereal time) and frequency, as shown in Figure 13. The subplots from top to bottom show baselines between two feeds (a) on the same cylinder, (b) on two adjacent cylinders, and (c) on two non-adjacent cylinders. Each subplot shows the results of four consecutive days starting from September 6th, 2023; each day is a sub-panel, ordered from bottom to top.

      Typical phase of raw visibilities as a function of LST and frequency for 4 days starting from Sept. 6th, 2023. (A) Baseline A3Y-A15Y; (B) Baseline A3Y-B18Y; (C) Baseline A3Y-C15Y.

      3.5 Power consumption

      All devices are powered by PDU (Power Distribution Unit), and the voltage and current usage of the devices can be monitored through the PDU management interface. The entire correlator system uses a total of 3 PDUs. The six ROACH2 boards and the master computer are connected to one PDU. The first 7 GPU servers and the 10 GbE switch are connected to another PDU. The last 7 servers and the 1 GbE Ethernet switch are connected to the third PDU.

      The total power of the F-engine, including six ROACH2 boards and one master computer, is 220 V × 3.5 A = 770 W. The total power of the X-engine, including seven GPU servers, one 10 GbE switch, and one 1 GbE switch, is 220 V × 17.5 A = 3,850 W. Therefore, the total power of the whole correlator system is 770 W + 3,850 W = 4,620 W for 192 inputs, or about 24 W per input, which is energy-efficient for an interferometer system of this scale.
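      The power budget above can be reproduced with the following sketch. It simply multiplies the mains voltage by the current readings quoted in the text, so the result is an apparent power that ignores any power factor; the variable names are illustrative.

```python
# Power budget from the PDU readings quoted in the text.

MAINS_VOLTAGE_V = 220.0
F_ENGINE_CURRENT_A = 3.5    # six ROACH2 boards + master computer
X_ENGINE_CURRENT_A = 17.5   # seven GPU servers + 10 GbE and 1 GbE switches
N_INPUTS = 192

f_engine_w = MAINS_VOLTAGE_V * F_ENGINE_CURRENT_A
x_engine_w = MAINS_VOLTAGE_V * X_ENGINE_CURRENT_A
total_w = f_engine_w + x_engine_w
per_input_w = total_w / N_INPUTS

print(f"F-engine {f_engine_w:.0f} W, X-engine {x_engine_w:.0f} W, "
      f"total {total_w:.0f} W, {per_input_w:.1f} W per input")
```

      The X-engine accounts for most of the budget, so any reduction in the number of GPU servers would translate almost directly into lower total power.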

      4 Summary

      In this paper, we have designed and deployed a correlator for the 192-input cylinder array. Based on the hybrid structure of the ROACH2-GPU correlator, data acquisition and pre-processing are performed by the F-engine, which consists of six ROACH2 boards. The F-engine has been tested, debugged, and analyzed; it works within a suitable linear range, and the calibrator noise source is switched at a cadence aligned with the integration time. For the X-engine, we carried out hardware testing and designed the data storage, achieving complete and orderly storage of the data from the 7 GPU servers. The X-engine function is implemented using DELL 2020 servers, NVIDIA GeForce RTX3080 graphics cards, and the Rocky 8 system.

      As the Tianlai radio interferometric array is currently being expanded, the correlator design can be scaled by increasing the number of ROACH2 boards according to the number of input signals, and by setting an appropriate number of frequency channels and data packet size. The X-engine can use higher-end servers and graphics cards so that each server handles more tasks, reducing the number of servers required. Our future work is to implement this design on larger systems.

      Data availability statement

      The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

      Author contributions

      ZW: Writing–original draft. J-XL: Writing–review and editing. KZ: Software, Writing–original draft. F-QW: Writing–review and editing. H-JT: Writing–review and editing. C-HN: Writing–review and editing. J-YZ: Writing–review and editing. Z-PC: Writing–review and editing. D-JY: Writing–review and editing. X-LC: Writing–review and editing.

      Funding

      The author(s) declare that financial support was received for the research, authorship, and/or publication of this article. We acknowledge the support by the National SKA Program of China (Nos 2022SKA0110100, 2022SKA0110101, and 2022SKA0130100), the National Natural Science Foundation of China (Nos 12373033, 12203061, 12273070, 12303004, and 12203069), the CAS Interdisciplinary Innovation Team (JCTD-2019-05), the Foundation of Guizhou Provincial Education Department (KY (2023)059), and CAS Youth Interdisciplinary Team. This work is also supported by the office of the leading Group for Cyberspace Affairs, CAS (No. CAS-WX2023PY-0102) and CAS Project for Young Scientists in Basic Research (YSBR-063).

      Conflict of interest

      The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

      Publisher’s note

      All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

      https://tianlai.bao.ac.cn

      https://github.com/TianlaiProject

      https://github.com/david-macmahon/hashpipe

      https://casper.astro.berkeley.edu/wiki/ADC16x250-8_coax_rev_2

      https://casper.astro.berkeley.edu/wiki/Block_Documentation

      https://casper.astro.berkeley.edu/wiki/PAPER_Correlator_EQ

      https://github.com/GPU-correlators/xGPU

      References

      Ali Z. S., Parsons A. R., Zheng H., Pober J. C., Liu A., Aguirre J. E. (2015). PAPER-64 constraints on reionization: the 21 cm power spectrum at z = 8.4. ApJ 809, 61. doi:10.1088/0004-637X/809/1/61
      Ansari R., Campagne J. E., Colom P., Le Goff J. M., Magneville C., Martin J. M. (2012). 21 cm observation of large-scale structures at z ∼ 1: instrument sensitivity and foreground subtraction. A&A 540, A129. doi:10.1051/0004-6361/201117837
      Chen X. (2011). Radio detection of dark energy - the Tianlai project. Mech. Astronomica 41, 1358–1366. doi:10.1360/132011-972
      Cianciara A. J., Anderson C. J., Chen X., Chen Z., Geng J., Li J. (2017). Simulation and testing of a linear array of modified four-square feed antennas for the Tianlai cylindrical radio telescope. J. Astron. Instrum. 6, 1750003. doi:10.1142/S2251171717500039
      Clark M. A., Plante P. L., Greenhill L. J. (2013). Accelerating radio astronomy cross-correlation with graphics processing units. Int. J. High Perform. Comput. Appl. 27, 178–192. doi:10.1177/1094342012444794
      Das S., Anderson C. J., Ansari R. (2018). “Society of photo-optical instrumentation engineers (SPIE) conference series,” in Millimeter, submillimeter, and far-infrared detectors and instrumentation for astronomy IX. Editors Zmuidzinas J., Gao J.-R., 10708, 1070836. doi:10.1117/12.2313031
      Hickish J., Abdurashidova Z., Ali Z. (2016). A decade of developing radio-astronomy instrumentation using CASPER open-source technology. J. Astronomical Instrum. 5, 1641001. doi:10.1142/S2251171716410014
      Kocz J., Greenhill L., Barsdell B. (2015). Digital signal processing using stream high performance computing. J. Astronomical Instrum. 4, 1550003. doi:10.1142/S2251171715500038
      Kovetz E., Breysse P. C., Lidz A. (2019). Astrophysics and cosmology with line-intensity mapping. BAAS 51, 101. doi:10.48550/arXiv.1903.04496
      Kwak J., Podczerwinski J., Timbie P., Ansari R., Marriner J., Stebbins A. (2024). The effects of the local environment on a compact radio interferometer I: cross-coupling in the Tianlai dish pathfinder array. J. Astron. Instrum. 13, 2450002. doi:10.1142/S2251171724500028
      Li J., Zuo S., Wu F., Wang Y., Zhang J., Sun S. (2020). The Tianlai Cylinder Pathfinder array: system functions and basic performance analysis. Mech. Astronomy 63, 129862. doi:10.1007/s11433-020-1594-8
      Li J.-X., Wu F.-Q., Sun S.-J., Yu Z. J., Zuo S. F., Liu Y. F. (2021). Reflections and standing waves on the Tianlai cylinder array. Astronomy Astrophysics 21, 059. doi:10.1088/1674-4527/21/3/059
      Liu A., Shaw J. R. (2020). Data analysis for precision 21 cm cosmology. PASP 132, 062001. doi:10.1088/1538-3873/ab5bfd
      MacMahon D. H., Price D. C., Lebofsky M., Siemion A. P. V., Croft S., DeBoer D. (2018). The Breakthrough Listen search for intelligent life: a wideband data recorder system for the Robert C. Byrd Green Bank Telescope. Publ. Astron. Soc. Pac. 130, 044502. doi:10.1088/1538-3873/aa80d2
      Niu C.-H., Wang Q.-X., MacMahon D., Wu F. Q., Chen X. L., Li J. X. (2019). The design and implementation of a ROACH2+GPU based correlator on the Tianlai dish array. Res. Astron. Astrophys. 19, 102. doi:10.1088/1674-4527/19/7/102
      Parsons A., Backer D., Siemion A., Chen H., Werthimer D., Droz P. (2008). A scalable correlator architecture based on modular FPGA hardware, reuseable gateware, and data packetization. Publ. Astronomical Soc. Pac. 120, 1207–1221. doi:10.1086/593053
      Parsons A. R., Backer D. C., Foster G. S., Wright M. C. H., Bradley R. F., Gugliucci N. E. (2010). The Precision Array for Probing the Epoch of Re-ionization: eight station results. Astronomical J. 139, 1468–1480. doi:10.1088/0004-6256/139/4/1468
      Pei X., Li J., Wang N., Ergesh T., Duan X. F., Ma J. (2021). Design of a multi-function high-speed digital baseband data acquisition system. Res. Astron. Astrophys. 21, 248. doi:10.1088/1674-4527/21/10/248
      Perdereau O., Ansari R., Stebbins A., Timbie P. T., Chen X., Wu F. (2022). The Tianlai dish array low-z surveys forecasts. MNRAS 517, 4637–4655. doi:10.1093/mnras/stac2832
      Prestage R. M., Constantikes K. T., Hunter T. R., King L., Lacasse R., Lockman F. (2009). The Green Bank telescope. Proc. IEEE 97, 1382–1390. doi:10.1109/jproc.2009.2015467
      Price D. C. (2021). in The WSPC handbook of astronomical instrumentation: volume 1: radio astronomical instrumentation (World Scientific), 159–179.
      Price D. C., MacMahon D. H., Lebofsky M., Croft S., DeBoer D., Enriquez J. E. (2018). The Breakthrough Listen search for intelligent life: wide-bandwidth digital instrumentation for the CSIRO Parkes 64-m telescope. Publ. Astron. Soc. Aust. 35, e041. doi:10.1017/pasa.2018.36
      Sun S., Li J., Wu F., Timbie P., Ansari R., Geng J. (2022). The electromagnetic characteristics of the Tianlai cylindrical pathfinder array. Astronomy Astrophysics 22, 065020. doi:10.1088/1674-4527/ac684d
      Thompson A. R., Moran J. M., Swenson G. W. J. (2017). Interferometry and synthesis in radio astronomy. 3rd Edition. doi:10.1007/978-3-319-44431-4
      Wu F., Li J., Zuo S., Chen X., Das S., Marriner J. P. (2021). The Tianlai dish pathfinder array: design, operation, and performance of a prototype transit radio interferometer. MNRAS 506, 3455–3482. doi:10.1093/mnras/stab1802
      Wu F., Wang Y., Zhang J. (2014). “Site selection for the Tianlai experiment,” in 2014 XXXIth URSI General Assembly and Scientific Symposium (URSI GASS), Beijing, China, 16–23 August 2014, 1–4.
      Xu Y., Wang X., Chen X. (2015). Forecasts on the dark energy and primordial non-Gaussianity observations with the Tianlai cylinder array. ApJ 798, 40. doi:10.1088/0004-637X/798/1/40
      Yu K., Wu F., Zuo S., Li J., Sun S., Wang Y. (2023). A simulation of calibration and map-making errors of the Tianlai cylinder pathfinder array. Astronomy Astrophysics 23, 105008. doi:10.1088/1674-4527/acf032
      Yu K., Zuo S., Wu F., Wang Y., Chen X. (2024). Application of regularization methods in the sky map reconstruction of the Tianlai cylinder pathfinder array. Res. Astron. Astrophys. 24, 025002. doi:10.1088/1674-4527/ad1223
      Yu Z., Deng F., Niu C. (2022b). Astronomer’s Telegr. 15758, 1.
      Yu Z., Deng F., Sun S., Niu C., Li J., Wu F. (2022a). A fast radio burst backend for the Tianlai dish pathfinder array. Res. Astron. Astrophys. 22, 125007. doi:10.1088/1674-4527/ac977c
      Yu Z., Deng F.-R., Niu C.-H. (2022c). Detection of a bright FRB with the Tianlai cylinder pathfinder array. Astronomer’s Telegr. 15342, 1.
      Zhang J., Zuo S.-F., Ansari R., Chen X., Li Y. C., Wu F. Q. (2016). Sky reconstruction for the Tianlai cylinder array. Astronomy Astrophysics 16, 158. doi:10.1088/1674-4527/16/10/158
      Zuo S., Li J., Li Y., Santanu D., Stebbins A., Masui K. (2021). Data processing pipeline for Tianlai experiment. Astronomy Comput. 34, 100439. doi:10.1016/j.ascom.2020.100439
      Zuo S., Pen U.-L., Wu F., Li J., Stebbins A., Wang Y. (2019). An eigenvector-based method of radio array calibration and its application to the Tianlai cylinder pathfinder. Astronomical J. 157, 34. doi:10.3847/1538-3881/aaf4c0