Meta takes on OpenAI, and a domestic "small model" is officially open-sourced. Where is the "Hundred Models War" heading?

  Since the beginning of this year, global Internet giants have set off a "hundred-model war", with Microsoft, Google, Baidu and Alibaba entering the fray one after another. After more than half a year of competition, the technology giants now face a new dispute over direction in the large-model ecosystem: with parameter counts approaching a "ceiling", will the future of large models be closed or open?

  The open source model can run on a home computer.

  On August 3rd, two open-source models, Qwen-7B and Qwen-7B-Chat, were released on the domestic AI developer community "ModelScope". They are, respectively, the 7-billion-parameter general model and dialogue model of Alibaba Cloud's Tongyi Qianwen. Both are open source, free, and available for commercial use.

  According to reports, Tongyi Qianwen Qwen-7B is a base model that supports multiple languages, including Chinese and English, and was trained on a data set of more than 2 trillion tokens (text units). Qwen-7B-Chat is a Chinese-English dialogue model built on that base model and aligned with human cognition. In short, the former is the "foundation" and the latter is the "house" built on it.

  Tests show that Qwen-7B performs well overall. On the English-language benchmark MMLU, its score is generally higher than those of mainstream models of the same parameter scale, and even surpasses some models with 12 billion and 13 billion parameters. On the validation set of the Chinese benchmark C-Eval, it also achieved the highest score among models of the same scale. Qwen-7B likewise ranks among the best on GSM8K, which evaluates mathematical problem solving, and HumanEval, which evaluates coding ability.

  In other words, in tests of Chinese and English writing, mathematical problem solving and coding, Qwen-7B is a true "top student", scoring even higher than mainstream international models at the same parameter level.

  Beyond benchmarks, the industry cares even more about how usable Qwen-7B is. Training and running mainstream large models requires specialized AI training chips such as the NVIDIA A100. These chips are expensive, at as much as 10,000 to 15,000 US dollars per A100, are monopolized by suppliers in Europe and the United States, and are almost impossible to buy in China. The domestic Qwen-7B model, by contrast, can be deployed on consumer graphics cards, which means a high-performance home computer is enough to run it.
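  A back-of-the-envelope estimate helps explain why a 7-billion-parameter model can fit on a consumer graphics card. The sketch below counts only the memory needed to store the weights at common numeric precisions. The 24 GB card size is an illustrative assumption rather than a figure from the report, and a real deployment also needs memory for activations and caches.

```python
# Rough estimate of the memory needed just to hold a model's weights.
PARAMS_QWEN_7B = 7_000_000_000  # "7 billion parameters", as stated above

# Bytes used to store one parameter at each numeric precision.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1, "int4": 0.5}


def weight_memory_gb(num_params, precision):
    """Return weight storage in GB (1 GB = 10**9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


# Hypothetical consumer card size, chosen only for illustration.
ASSUMED_CARD_GB = 24

for precision in BYTES_PER_PARAM:
    need = weight_memory_gb(PARAMS_QWEN_7B, precision)
    fits = need < ASSUMED_CARD_GB
    print(f"{precision}: {need:.1f} GB of weights, fits on assumed card: {fits}")
```

  Under these assumptions, half-precision weights need about 14 GB, while full-precision weights would not fit on the assumed card.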

  Thanks to free commercial use and a low barrier to entry, Qwen-7B attracted the attention of AI developers as soon as it was released. In just one day, the model was starred by more than a thousand developers on the code hosting platform GitHub, and most of those asking questions were Chinese developers. As Alibaba Cloud said in its statement: "Compared with the lively AI open-source ecosystem in the English-speaking world, the Chinese community lacks an excellent base model. The addition of Tongyi Qianwen is expected to give the open-source community more choices and promote the construction of China's AI open-source ecosystem."

  Open source or closed?

  Qwen-7B is not the first open-source large model. GPT-2, a predecessor of ChatGPT, was also completely open source: its code and framework can be used for free on the Internet, and the related papers can be consulted. However, after ChatGPT swept the world, OpenAI chose closed-source development, and the code of models such as GPT-3 and GPT-4 has become OpenAI's trade secret.

  Open source simply means that the source code is open. Once a large model is declared open source, anyone can publicly obtain its source code and modify or even redevelop it within the scope of copyright restrictions. To make a simple analogy, the source code is like the manuscript, or line sketch, of a painting: everyone can fill in colors on it to create a painting of their own.

  Closed source is the opposite. Only the owner of the source code (usually the software developer) has the power to modify it; others cannot get the "manuscript" and can only buy the finished product from the developer.

  The advantages and disadvantages of each are obvious. An open-source large model will undoubtedly attract more developers and richer applications, but supervision and commercialization become difficult, and the developer can easily end up "making wedding clothes for others", doing the work while others reap the benefit. After all, open source is about the shared prosperity of the ecosystem, and at this stage it is hard to work out how much money it can earn. These problems are exactly where closed source finds its opportunity.

  Open source or closed source: this is a life-or-death question for large models, and the international giants have given their answers.

  Meta, the parent company of Facebook, released the large model Llama 2 last month, open source and free for developers and business partners. OpenAI, by contrast, has firmly chosen closed-source development for GPT-4, which not only maintains its leading position in the generative AI industry but also brings in more revenue. According to the authoritative magazine Fast Company, OpenAI's revenue in 2023 will reach 200 million US dollars, including fees for API services and chatbot subscriptions.

  Domestic large models have also gradually begun to "go their separate ways". Alibaba Cloud's Tongyi Qianwen model was announced as open to enterprises as early as April this year, and open-sourcing Qwen-7B goes a step further. Baidu's ERNIE Bot has also recently announced that it will gradually open its plug-in ecosystem to third-party developers, helping them build their own applications on the Wenxin model.

  Huawei, in contrast, is taking a different path. When Pangu Large Model 3.0 was released, Huawei Cloud stated publicly that the full technology stack of the Pangu model is Huawei's own independent innovation and uses no open-source technology. In addition, the Pangu model will gather a large amount of industry data (some of it involving industry secrets), so it will not be open-sourced in the future.

  Big parameters, or small and beautiful?

  The open-sourcing of Qwen-7B also raises another question: how many parameters does a large model really need?

  There is no denying that the parameter scale of large models keeps expanding. Take OpenAI's GPT models as an example: GPT-1 contained only 117 million parameters, while GPT-3 reached 175 billion, an increase of more than 1,000 times in a few years, and GPT-4 is reported to have exceeded the trillion level.
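  The "more than 1,000 times" figure can be checked directly from the two parameter counts quoted above. The short sketch below uses only those two numbers; the variable names are illustrative.

```python
# Parameter counts as quoted in the article.
GPT1_PARAMS = 117_000_000        # 117 million
GPT3_PARAMS = 175_000_000_000    # 175 billion

# Ratio of GPT-3's parameter count to GPT-1's.
growth = GPT3_PARAMS / GPT1_PARAMS

print(f"GPT-3 has roughly {growth:,.0f} times as many parameters as GPT-1")
```

  The ratio comes out at roughly 1,500, consistent with the article's description.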

  The same is true of domestic large models. Baidu's Wenxin model has 260 billion parameters, Tencent's Hunyuan model has reached the 100-billion level, Huawei's Pangu model is estimated to be close to GPT-3.5 in scale, and Alibaba's Tongyi large model has officially been announced at 10 trillion parameters. According to incomplete statistics, there are at least 79 large models with more than 1 billion parameters in China.

  However, a larger parameter count does not necessarily make a large model more capable. At the World Artificial Intelligence Conference, Wu Yunsheng, vice president of Tencent Cloud, offered an apt metaphor: "It is like athletes doing strength training. Weightlifters need to lift 200-kilogram barbells, while swimmers need to lift 100 kilograms. Not every type of athlete needs to train with a 200-kilogram barbell."

  As is well known, the more parameters a large model has, the more resources and money it consumes. Large models built for vertical industries therefore need not blindly pursue "large scale" or "high parameters"; their parameter counts should instead be set according to customer needs. For example, the BioGPT-Large model has only 1.5 billion parameters, yet its accuracy on specialized biomedical tests is better than that of general models with 100 billion parameters.

  Sam Altman, co-founder of OpenAI, has also stated publicly that OpenAI is approaching the limit of LLM (large language model) scale. A larger scale does not necessarily mean a better model, and parameter count is no longer an important indicator of model quality.

  Wu Di, head of intelligent algorithms at Volcano Engine, holds a similar view: in the long run, reducing cost will become an important factor in putting large models into use. "A well-tuned small or medium-sized model may perform as well as a general large model on a specific task, at perhaps only one tenth of the cost."

  At present, almost every domestic technology company has secured its ticket to the large-model race, but the real choice of road has only just begun.

Taking stock of natural resources: the third national land survey is launched in Sichuan


On February 28, 2019, Sichuan Province held a video conference on advancing the third national land survey. Deng Gan


Field investigation under way in Pujiang. (Photo courtesy of Pujiang County Planning and Natural Resources Bureau)

Land is the foundation of nature, the source of ecology, the key to production and the basis of survival. Sichuan covers 486,000 square kilometers, the fifth-largest land area in the country, and has every type of landform except the ocean. How many types of land are there across this vast territory? How is the land being used for production? How is land ownership divided? These questions are closely tied to our lives.

In October 2017, the State Council arranged and launched the third national land survey; in September 2018, it decided to change this to the third national territory survey (the "Third Survey" for short). The purpose of the Third Survey is to take full stock of the country's land resources and the current state of land use, obtain true and accurate basic data on national land, improve the system for surveying, monitoring and compiling statistics on natural resources, strengthen public information services on natural resources, and meet the needs of economic and social development and natural resources management.

At present, in line with unified national requirements, the province is running a 100-day push toward the milestone of completing county-level surveys and database construction by the end of May 2019. A unified data update will then be carried out with December 31, 2019 as the reference date, and all tasks are to be completed in 2020.

Faced with such vast natural resources, how will Sichuan take stock of them, and by what means?

A

From "Land" Survey to "National Territory" Survey

Taking a step toward unified management of natural resources

According to the relevant requirements of the Land Administration Law of the People’s Republic of China and the Regulations on Land Investigation, China conducts a survey on the basic national conditions and national strength of land every ten years.

Before the Third Survey, China had conducted two national land surveys: the first, a detailed land survey, from 1984 to 1996, and the second from 2007 to 2009. Building on the data from these two surveys, the figures have been supplemented and updated every year to reflect actual changes. Unlike the previous two, this survey has been renamed: the "land" survey has become a "national territory" survey.

From "land" to "national territory" is a difference of only one word, but it highlights the new features of this survey. Behind the renaming, the survey reflects the new strategy in the central government's decision-making, carries out the new tasks that follow the institutional reform, and promotes new requirements for objectivity and authenticity. Under the institutional reform plans of the State Council and local people's governments, a competent department of natural resources is established to exercise, in a unified way, the duties of owner of natural resource assets owned by the whole people, and the duties of controlling the use of all territorial space and of ecological protection and restoration. Compared with the second national land survey, the classification used in the Third Survey has changed significantly, with a series of mergers and adjustments. In particular, to meet the needs of building ecological civilization, wetland has been made a first-level land class and garden land has been adjusted to plantation land, so that the number of first-level land classes has increased from 12 to 13. The Third Survey places clearer emphasis on the authenticity of the data and explicitly drops the priority order of "cultivated land, garden, forest, grass" for determining land types. To meet the needs of overlapping management by multiple departments, it also introduces attribute marking of map patches. (On the survey work map, land units with basically the same landform and land use type are grouped into one class and then outlined on the topographic map; each outlined unit is a patch.)

According to the relevant person in charge of the Provincial Department of Natural Resources, the shift from "land" to "national territory" is an important step toward centralized and unified surveying of natural resources. It lays a solid foundation for the unified exercise of the duties of owner of all natural resource assets owned by the whole people, and for the unified exercise of the duties of controlling the use of all national territorial space and of ecological protection and restoration.

In fact, the Third Survey is the first step toward unified management of natural resources. "Only by laying a good foundation can we create the conditions for development and utilization."

B

From "Nine Dragons Managing the Water" to "One Set of Data"

Taking stock to lay the foundation for high-quality development

The Third Survey is a major survey of basic national conditions and national strength carried out after China's development entered a new stage. It bears not only on the construction of ecological civilization but also on a series of basic judgments about natural resource conditions and national strength that underpin China's second centenary goal. "Carrying out the third national land survey is of great significance for thoroughly implementing the decisions and arrangements of the Third and Fourth Plenary Sessions of the Eleventh Provincial Party Committee, comprehensively promoting the high-quality development of the province and raising the governance of Sichuan to a new level," the relevant person in charge of the Provincial Department of Natural Resources said.

The institutional reform has made it clear that natural resources management departments are to plan territorial space in a unified way and to control the use of all natural ecological space in a unified way. The inconsistencies in basic data, coordinate systems, planning periods and control rules that affected earlier spatial plans will be resolved, and the division of spatial management powers will become clearer, achieving non-overlapping control over a single national territorial space and providing a solid basis for "one blueprint drawn through to the end". The basic data for compiling territorial spatial plans depends on the Third Survey.

From the perspective of protecting ecological resources, the picture of land, water, woodland, grassland, wetland and other resources produced by the Third Survey will provide a basis for the overall protection, systematic restoration and comprehensive management of mountains, rivers, forests, farmland, lakes and grasslands. Only with a full inventory of ecological land such as forest land, grassland and wetland across the province can the three control lines (the ecological protection red line, permanent basic farmland and the urban development boundary) be optimized, control over the use of territorial space be effectively implemented, the campaign to green all of Sichuan be carried out effectively, and the ecosystem functions of the upper reaches of the Yangtze River be continuously improved.

From the perspective of high-quality economic development, establishing how every type of land is used will provide a basis for scientifically compiling the province's territorial spatial plan and better optimizing the pattern of land development. By reasonably determining the total amount, location and structure of land supply, the province will keep promoting industrial transformation and upgrading and further advance supply-side structural reform.

Against the background of natural resources management reform, the Third Survey covers a very wide range of work. "In the past, the current state of land use was mainly surveyed by the land department, forest and wetland resources by the forestry department, grassland resources by the agricultural department, and water resources by the water resources department ... The data and survey standards of the various departments differed, resulting in overlapping data and wasted administrative resources." The relevant person in charge of the Provincial Department of Natural Resources said that this survey builds on the second national land survey and draws on the existing survey data of forestry, agriculture, water conservancy and other departments, breaking the old "nine dragons managing the water" pattern of fragmented surveys and producing one base map and one set of data.

To implement the unified national arrangements and requirements, the province, its cities (prefectures) and its counties (cities, districts) have each set up a leading group for the third national land survey. Each group is headed by the principal government leader or the leader in charge, with the responsible deputy secretary-general and the principal leader of the natural resources department as deputy heads and the responsible leaders of all relevant departments as members. Each has also set up a leading group office and corresponding working groups, in particular a technical guidance group, to ensure that related issues are handled strictly in accordance with the Third Survey technical regulations.

C

From traditional means to "Internet Plus"

Top-level technical design to make data traceable

Building on the relevant national requirements and taking local conditions into account, the province organized its forces to work on the top-level technical design. It drew up an implementation plan and technical specifications that define the objectives and tasks, the content of the work, the division of tasks among province, cities and counties, organization and implementation, and technical methods, and filed them with the Ministry of Natural Resources for the record. It produced, on a unified basis, key control data for the whole province, such as the transportation and water conservancy networks, survey boundaries and control areas. It also set up a provincial "Internet Plus" evidence platform and formulated management measures for office-based and field verification in quality control of the province's survey results.

To make the survey data more accurate, the provincial Third Survey office held nine training courses from June to August 2018, with more than 5,500 people trained in total and more than 4,400 social workers given professional training. To test and reinforce the training, an examination was also arranged, and 3,800 people ultimately passed.

Before the Third Survey was rolled out on a large scale, the province tried it first through pilots. In addition to the national pilot of new land survey technology in Daying County, the province designated four provincial-level pilot demonstration districts and counties, namely Jinjiang District of Chengdu, Renhe District of Panzhihua City, Longmatan District of Luzhou City and Changning County of Yibin City, to carry out the Third Survey in an orderly manner. "The survey work in these pilot demonstration districts and counties has set a good example and laid the foundation for full implementation of the Third Survey across the province," the relevant person in charge of the Provincial Department of Natural Resources said.

The survey base map is the basic data of the Third Survey and bears on the overall accuracy of the survey. For this reason, the province has assigned dedicated staff to liaise regularly with the national Third Survey office and the Provincial Surveying and Mapping Geographic Information Bureau, to track the production of the survey base maps, and to receive and distribute them promptly. At present, the results for all counties in the province have been obtained, covering nearly 486,000 square kilometers, including 3.653 million map patches, covering an area of nearly 53.815 million mu. The Provincial Department of Natural Resources has also coordinated with the departments responsible for civil affairs, ecology and environment, housing and urban-rural construction, transportation, water conservancy, agriculture and rural affairs, forestry and grassland, surveying and mapping, and railways, and has done solid work in collecting basic survey data.

The "Internet Plus" evidence platform is the "secret weapon" of the Third Survey. The survey uses a quality assurance system with this platform at its core: alongside the interpretation of high-resolution remote sensing images, national and provincial departments carry out office-based verification using photos taken on site as evidence, which makes the results highly reliable. In addition, when field investigators take photos on site, information such as shooting angle and coordinates is matched one-to-one with each photo, ensuring the authenticity and accuracy of the survey data.

Chengdu's Qingbaijiang explores the "Third Survey"

Make the survey detailed and the data solid

In mid-January, in Longwang Village, Renhe Town, Qingbaijiang District, Chengdu, the morning fog has not yet lifted, and investigator Yan Quanlin has already begun the day's work. Sichuan Chuanhe Surveying and Mapping Geographic Information Co., Ltd., where Yan Quanlin works, is responsible for the third national land survey in Qingbaijiang District. Going into the villages to check land use is routine for its staff.

This is a microcosm of the third national land survey in the province. How is the survey done? What is it for? The practice of this small village offers a glimpse.

How is it done? The Third Survey goes all the way down, and verification misses no spot.

The base map, issued by the state, is the foundation of the survey. It is produced from the latest orthophoto imagery: land use types are interpreted patch by patch from image characteristics and the land use patches are extracted, with reference to the 2016 land change survey database. To make the details clearer, Qingbaijiang invested more than 5 million yuan in aerial photography of the whole district at a large scale of 1:2000. "Things as small as a single tree or a stone table show up on the map."

According to the requirements of the "three adjustments", it is necessary to accurately find out the use type, area, ownership and distribution of each piece of land in urban and rural areas throughout the country and establish a land survey database. "What we have to do is to go to every household and truthfully reflect the current situation of land use." Yan Quanlin said.

In the field, investigators carry a tablet computer loaded with an electronic version of the base map. It has a positioning function, can navigate for the investigators, and lets them sketch corrections directly when they find a patch that does not match the map. "In the last round of surveys, because the base map was not clear enough, some houses could not be seen on the map and were overlooked, which caused great inconvenience to farmers." Liu Guanghai, deputy director of the Third Survey office in Qingbaijiang District, said that the district has set up Third Survey leading groups at every level and has added village cadres to the contact lists of the township Third Survey groups, so that they accompany investigators on household visits and ensure nothing is missed.

It is worth mentioning that, alongside the Third Survey, Qingbaijiang District has further investigated and verified collective land ownership, the right to use collective construction land (including the right to use homesteads), contracted management rights, agricultural land use rights and forest rights, and has asked the villagers' groups to check collective land ownership, so as to lay a solid foundation for vector graphics management of natural resources across the district.

How is it viewed? In-depth publicity has made the survey known to everyone, and only by establishing the baseline can resources be revitalized.

"They come to investigate often. We all know about it and we welcome them." In Longwang Village, the Third Survey is not just a publicity notice posted in the village committee office but a major matter that every household genuinely cares about. With the cooperation of residents, the Third Survey in Qingbaijiang has picked up speed.

From the start of the Third Survey, the town convened a meeting of village cadres to convey the substance of the work to the grassroots promptly and to explain fully what it means for rural development. Longwang Village has also held many members' meetings to spread knowledge of the Third Survey. The village lies deep in the hills, and living conditions are poor. "When there is a drought in summer, we have to haul water up by vehicle." Xiao Peng, the village director, said the survey showed that more than 90% of the villagers were willing to move to the foot of the mountain.

On January 16th, visitors to the village found that every villager could say a few words about what the Third Survey means. In Xiao Peng's view, a clear inventory will lay a good foundation for the next step, implementing the "linked increase and decrease" land project. For villager Liu Weicheng, the significance lies in "finding out exactly how big our house is, so that when we move to the community and the compensation is calculated, everything will be clear."

Through the "linked increase and decrease" land project, villagers will move to concentrated residential areas, which will change their production and living conditions and expand the village's room for development. "We intend to bring in social capital and build a rural tourism project." Xiao Peng said that only when the survey has made clear what the village has can the project be planned smoothly and a good foundation be laid for rural revitalization. "The people are very supportive of this."

"Third Survey" Tips

A few days ago, the Office of the State Council's Leading Group for the Third National Land Survey issued the Technical Questions and Answers of the Third National Land Survey, which sorts out 70 specific operational problems and addresses the thorny issues encountered in the survey. Some of them are selected here as a reference for the Third Survey work.

1. For newly added cultivated land in a land consolidation project area, should the boundary of the cultivated-land patch be determined from the land consolidation project documents?

Answer: Plots that are currently cultivated land, whether they result from land consolidation, farmers' own development or reclamation projects, should be surveyed according to the extent of cultivated land found on the ground. The patch boundary should not be determined from the consolidation project's scope.

2. On the ground the plot is irrigated land, but it was previously managed as paddy field. How should it be surveyed?

Answer: Plots under paddy-upland rotation are surveyed as paddy field; plots used as irrigated land over the long term are surveyed as irrigated land. If a plot is surveyed as irrigated land, on-site evidence must be provided.

3. How should a patch be surveyed when the land use recorded in its rights registration is inconsistent with the actual use?

Answer: Survey it according to its actual current use. Surveying directly by the approved or planned use of the land is not allowed.

4. Some areas have a large number of rural houses of 100 to 200 square meters. May some counties and districts set the minimum mapped area for construction land at 100 square meters?

Answer: Each province (autonomous region or municipality directly under the Central Government) may raise the survey accuracy for construction land within its own territory and lower the minimum mapped area, but accuracy must be kept uniform across the whole province (autonomous region or municipality).

5. Cultivated land that has been abandoned for a long time is covered with shrubs, but the field ridges are still clearly visible. How should it be surveyed?

Answer: Survey it as forest land according to its current state. If farming can be resumed after clearing, mark it with the attribute "Farming can be resumed after clearing".

6. How should land be surveyed when villagers have planted fruit trees on land registered as cultivated land in the contract certificate?

Answer: Survey it according to its current state, as plantation land. It should not be surveyed according to the use registered in the contract.

7. Within the extent of a village, should land that is used on the ground for livestock breeding be surveyed as facility agricultural land?

Answer: Land where livestock and poultry are raised in a concentrated way within a village is surveyed as facility agricultural land. If it lies within the concentrated, contiguous extent of the village, it may be marked with attribute 203.

8. How should green land inside a factory be surveyed?

Answer: Green woodland and grassland inside a completed factory are surveyed as industrial land.

9. A patch interpreted by the state from imagery contains several land types (such as pits and ponds, weeds, woodland and scattered buildings), and the situation on the ground is complicated. How should it be surveyed?

Answer: The patch classification and extent provided by the state are for reference. Based on the current situation on the ground, the patch should be divided and each land type determined separately wherever the minimum mapped area is reached; parts that do not reach the minimum mapped area are merged into the adjacent land type. Where the result differs from the state's image-based interpretation, photos must be taken on site as evidence, as required.

10. Can field roads, ditches and the like be represented by single-line linear features?

Answer: No. Linear features such as roads, ditches and rivers that meet the standard for inclusion on the map should be re-vectorized according to the field survey results and image characteristics, and represented as patches. (Mi Fang Tan Wei)