Yarn

YARN is the resource-coordination component used in distributed systems.

MapReduce 1.x

In MapReduce 1.x, the Client submits a computation request to the JobTracker, which distributes the work to TaskTrackers for execution. At that stage, resource management and job computation were both bundled into MapReduce itself. This architecture made MapReduce bloated and bred a series of problems.

YARN was introduced to fix exactly this: YARN behaves like an operating system, and MapReduce is just one program running on top of it. YARN also hosts Spark, Flink, Tez, and other distributed computing frameworks, which is what made it so widely adopted.

YARN basics

YARN's core idea is to split resource management apart from job scheduling and monitoring: the ResourceManager handles resource management, while an ApplicationMaster handles job scheduling and monitoring.

The Client submits a job to the ResourceManager for resource scheduling. The ResourceManager assigns the work to NodeManagers. Each NodeManager hosts Containers, and each Container runs one piece of the job's computation. The NodeManager also hosts the job's ApplicationMaster: when a Container finishes, it reports completion to the ApplicationMaster, and the ApplicationMaster in turn keeps the ResourceManager informed of the job's status.

Basic YARN configuration

Edit mapred-site.xml and append at the end:

```xml
<configuration>
  <!-- Use YARN for resource scheduling when running MapReduce jobs -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>yarn.app.mapreduce.am.env</name>
    <value>HADOOP_MAPRED_HOME=/export/server/hadoop-3.3.6</value>
  </property>
  <property>
    <name>mapreduce.map.env</name>
    <value>HADOOP_MAPRED_HOME=/export/server/hadoop-3.3.6</value>
  </property>
  <property>
    <name>mapreduce.reduce.env</name>
    <value>HADOOP_MAPRED_HOME=/export/server/hadoop-3.3.6</value>
  </property>
</configuration>
```

Edit yarn-site.xml:

```xml
<configuration>
  <!-- Site specific YARN configuration properties -->
  <!-- Set the ResourceManager host -->
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>node1</value>
  </property>
  <!-- Configure YARN's shuffle service -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>
```

Add to hadoop-env.sh:

```sh
export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
export YARN_RESOURCEMANAGER_USER=root
```
```sh
export YARN_NODEMANAGER_USER=root
```

Then distribute the modified files to the other nodes:

```sh
scp mapred-site.xml yarn-site.xml hadoop-env.sh node2:$PWD
scp mapred-site.xml yarn-site.xml hadoop-env.sh node3:$PWD
```

After that, YARN can be started:

```sh
start-yarn.sh
```

The web UI is then available on port 8088.

Run a wordcount test:

```sh
hadoop jar /export/server/hadoop-3.3.6/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.6.jar wordcount /input /output1
```

If the job completes, YARN is scheduling jobs correctly.

To view job log information, two more properties are needed in mapred-site.xml:

```xml
<property>
  <name>mapreduce.jobhistory.address</name>
  <value>node1:10020</value>
</property>
<property>
  <name>mapreduce.jobhistory.webapp.address</name>
  <value>node1:19888</value>
</property>
```

And continue in yarn-site.xml:

```xml
<!-- Add the following configuration -->
<!-- Whether to enable log aggregation -->
<!-- When enabled, each Container's logs are saved under
     yarn.nodemanager.remote-app-log-dir (default: /tmp/logs) -->
<property>
  <name>yarn.log-aggregation-enable</name>
  <value>true</value>
</property>
<!-- How long to keep historical logs on HDFS, in seconds -->
<!-- -1 (the default for this property) means keep forever -->
<property>
  <name>yarn.log-aggregation.retain-seconds</name>
  <value>604800</value>
</property>
<property>
  <name>yarn.log.server.url</name>
  <value>http://node1:19888/jobhistory/logs</value>
</property>
```

YARN's job submission flow

The MapReduce program's Job creates a JobCommitter, which then handles the submission work.

The JobCommitter submits the job to the ResourceManager and requests an application ID. If resources are sufficient, the ResourceManager grants the application ID. At this point MapReduce uploads its program to HDFS, so that any node that needs it can download it for execution.

The JobCommitter then formally submits the job to the ResourceManager. The ResourceManager picks a lightly loaded NodeManager and assigns it the job.

That NodeManager creates a Container and starts an ApplicationMaster (AppMaster) to monitor the job and schedule its resources. The AppMaster fetches the relevant information from HDFS (split information and the job program); each split corresponds to one MapTask.

Once the required capacity is clear, the AppMaster requests resources from the ResourceManager, which assigns NodeManagers according to their current load. The assigned NodeManagers download the program from HDFS, run the actual computation, and report to the AppMaster during execution, for example notifying it on success.

YARN commands

View the tasks currently running in YARN:

```sh
yarn top
```

Run a quick test job:

```sh
hadoop jar /export/server/hadoop-3.3.6/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.6.jar pi 10 10
```

Use yarn application to view application information:

```sh
yarn application -list -appStates ALL       # all applications
yarn application -list -appStates FINISHED  # finished applications
yarn application -list -appStates RUNNING   # running applications
```

With an APPID from the listing, an application can be killed directly:

```sh
yarn application -kill <ApplicationId>
```
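The listing commands above print a tab-separated table, so extracting application IDs programmatically (for example, to feed them to yarn application -kill) only takes a short script. A minimal sketch; the column layout matches the yarn application -list header, but the sample output below is illustrative, not captured from a real cluster:

```python
def parse_application_ids(listing, state=None):
    """Extract application IDs from `yarn application -list` output.

    Data rows start with 'application_'; columns are tab-separated:
    Application-Id, Application-Name, Application-Type, User, Queue,
    State, Final-State, Progress, Tracking-URL.
    """
    ids = []
    for line in listing.splitlines():
        fields = line.split("\t")
        # Skip the "Total number of applications..." banner and the header row
        if not fields[0].strip().startswith("application_"):
            continue
        if state is None or fields[5].strip() == state:
            ids.append(fields[0].strip())
    return ids


# Illustrative sample output (hypothetical IDs and URLs)
sample = (
    "Total number of applications (application-types: [], states: [ALL] and tags: []):2\n"
    "Application-Id\tApplication-Name\tApplication-Type\tUser\tQueue\tState\tFinal-State\tProgress\tTracking-URL\n"
    "application_1700000000000_0001\tword count\tMAPREDUCE\troot\tdefault\tFINISHED\tSUCCEEDED\t100%\thttp://node1:19888/jobhistory\n"
    "application_1700000000000_0002\tQuasiMonteCarlo\tMAPREDUCE\troot\tsmall\tRUNNING\tUNDEFINED\t50%\thttp://node1:8088/proxy\n"
)

print(parse_application_ids(sample))                   # both IDs
print(parse_application_ids(sample, state="RUNNING"))  # only the second ID
```

The same filtering can of course be done with awk in a shell pipeline; the Python version just makes the column positions explicit.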
A killed application is marked as KILLED; use yarn application -list -appStates KILLED to review them.

YARN schedulers

FIFO scheduler: as the name says, first come, first served. It suffers from starvation: even a job that needs only a millisecond of runtime has to wait its turn behind everything submitted earlier.

Capacity scheduler: splits the resources into pools, for example 80% processed FIFO-style and the other 20% held back for other work. This largely avoids starving small jobs, but can waste resources, since the 20% pool may never be used.

Fair scheduler: allocates every job an equal share of resources; when a job finishes, its share is redistributed among the remaining jobs.

YARN queues

Out of the box, YARN uses a capacity scheduler with a single queue, which effectively behaves like FIFO. This is configurable: a common setup is two queues, one for the main workload and one for small jobs.

Edit capacity-scheduler.xml under etc/hadoop:

```xml
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->
<configuration>

  <!-- Maximum number of submitted applications YARN allows -->
  <property>
    <name>yarn.scheduler.capacity.maximum-applications</name>
    <value>10000</value>
    <description>Maximum number of applications that can be pending and
      running.</description>
  </property>

  <!-- Share of cluster resources Application Masters may use; 0.1 means 10% -->
  <property>
    <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
    <value>0.1</value>
    <description>Maximum percent of resources in the cluster which can be
      used to run application masters i.e. controls number of concurrent
      running applications.</description>
  </property>

  <!-- Resource calculator implementation -->
  <property>
    <name>yarn.scheduler.capacity.resource-calculator</name>
    <value>org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator</value>
    <description>The ResourceCalculator implementation to be used to compare
      Resources in the scheduler. The default i.e. DefaultResourceCalculator
      only uses Memory while DominantResourceCalculator uses dominant-resource
      to compare multi-dimensional resources such as Memory, CPU etc.</description>
  </property>

  <!-- There is only a default queue out of the box; add a small queue here
       for handling small jobs -->
  <property>
    <name>yarn.scheduler.capacity.root.queues</name>
    <value>default,small</value>
    <description>The queues at the this level (root is the root queue).</description>
  </property>

  <!-- Percentage of the resources each queue occupies -->
  <property>
    <name>yarn.scheduler.capacity.root.default.capacity</name>
    <value>70</value>
    <description>Default queue target capacity.</description>
  </property>

  <property>
    <name>yarn.scheduler.capacity.root.small.capacity</name>
    <value>30</value>
    <description>Small queue target capacity.</description>
  </property>

  <!-- Share of the queue's resources a single user may occupy -->
  <property>
    <name>yarn.scheduler.capacity.root.default.user-limit-factor</name>
    <value>1</value>
    <description>Default queue user limit a percentage from 0.0 to 1.0.</description>
  </property>

  <property>
    <name>yarn.scheduler.capacity.root.small.user-limit-factor</name>
    <value>1</value>
    <description>Small queue user limit a percentage from 0.0 to 1.0.</description>
  </property>

  <!-- Maximum share of the overall resources each queue may grow to -->
  <property>
    <name>yarn.scheduler.capacity.root.default.maximum-capacity</name>
    <value>100</value>
    <description>The maximum capacity of the default queue.</description>
  </property>

  <property>
    <name>yarn.scheduler.capacity.root.small.maximum-capacity</name>
    <value>100</value>
    <description>The maximum capacity of the small queue.</description>
  </property>

  <!-- Queue state; RUNNING means the queue is enabled -->
  <property>
    <name>yarn.scheduler.capacity.root.default.state</name>
    <value>RUNNING</value>
    <description>The state of the default queue. State can be one of RUNNING
      or STOPPED.</description>
  </property>

  <property>
    <name>yarn.scheduler.capacity.root.small.state</name>
    <value>RUNNING</value>
    <description>The state of the small queue. State can be one of RUNNING
      or STOPPED.</description>
  </property>

  <!-- Access control: which users may submit jobs to each queue -->
  <property>
    <name>yarn.scheduler.capacity.root.default.acl_submit_applications</name>
    <value>*</value>
    <description>The ACL of who can submit jobs to the default queue.</description>
  </property>

  <property>
    <name>yarn.scheduler.capacity.root.small.acl_submit_applications</name>
    <value>*</value>
    <description>The ACL of who can submit jobs to the small queue.</description>
  </property>

  <property>
    <name>yarn.scheduler.capacity.root.default.acl_administer_queue</name>
    <value>*</value>
    <description>The ACL of who can administer jobs on the default queue.</description>
  </property>

  <property>
    <name>yarn.scheduler.capacity.root.small.acl_administer_queue</name>
    <value>*</value>
    <description>The ACL of who can administer jobs on the small queue.</description>
  </property>

  <property>
    <name>yarn.scheduler.capacity.root.default.acl_application_max_priority</name>
    <value>*</value>
    <description>The ACL of who can submit applications with configured
      priority. For e.g, [user={name} group={name} max_priority={priority}
      default_priority={priority}]</description>
  </property>

  <property>
    <name>yarn.scheduler.capacity.root.small.acl_application_max_priority</name>
    <value>*</value>
    <description>The ACL of who can submit applications with configured
      priority. For e.g, [user={name} group={name} max_priority={priority}
      default_priority={priority}]</description>
  </property>

  <property>
    <name>yarn.scheduler.capacity.root.default.maximum-application-lifetime</name>
    <value>-1</value>
    <description>Maximum lifetime of an application which is submitted to a
      queue in seconds. Any value less than or equal to zero will be
      considered as disabled. This will be a hard time limit for all
      applications in this queue. If positive value is configured then any
      application submitted to this queue will be killed after exceeds the
      configured lifetime. User can also specify lifetime per application
      basis in application submission context. But user lifetime will be
      overridden if it exceeds queue maximum lifetime. It is point-in-time
      configuration. Note : Configuring too low value will result in killing
      application sooner. This feature is applicable only for leaf queue.</description>
  </property>

  <property>
    <name>yarn.scheduler.capacity.root.small.maximum-application-lifetime</name>
    <value>-1</value>
    <description>Maximum lifetime of an application which is submitted to
      this queue in seconds. Any value less than or equal to zero will be
      considered as disabled. (Same semantics as the default queue's setting
      above.)</description>
  </property>

  <property>
    <name>yarn.scheduler.capacity.root.default.default-application-lifetime</name>
    <value>-1</value>
    <description>Default lifetime of an application which is submitted to a
      queue in seconds. Any value less than or equal to zero will be
      considered as disabled. If the user has not submitted application with
      lifetime value then this value will be taken. It is point-in-time
      configuration. Note : Default lifetime can't exceed maximum lifetime.
      This feature is applicable only for leaf queue.</description>
  </property>

  <property>
    <name>yarn.scheduler.capacity.root.small.default-application-lifetime</name>
    <value>-1</value>
    <description>Default lifetime of an application which is submitted to
      this queue in seconds. (Same semantics as the default queue's setting
      above.)</description>
  </property>

  <property>
    <name>yarn.scheduler.capacity.node-locality-delay</name>
    <value>40</value>
    <description>Number of missed scheduling opportunities after which the
      CapacityScheduler attempts to schedule rack-local containers. When
      setting this parameter, the size of the cluster should be taken into
      account. We use 40 as the default value, which is approximately the
      number of nodes in one rack. Note, if this value is -1, the locality
      constraint in the container request will be ignored, which disables
      the delay scheduling.</description>
  </property>

  <property>
    <name>yarn.scheduler.capacity.rack-locality-additional-delay</name>
    <value>-1</value>
    <description>Number of additional missed scheduling opportunities over
      the node-locality-delay ones, after which the CapacityScheduler
      attempts to schedule off-switch containers, instead of rack-local
      ones. Example: with node-locality-delay=40 and rack-locality-delay=20,
      the scheduler will attempt rack-local assignments after 40 missed
      opportunities, and off-switch assignments after 40+20=60 missed
      opportunities. When setting this parameter, the size of the cluster
      should be taken into account. We use -1 as the default value, which
      disables this feature. In this case, the number of missed
      opportunities for assigning off-switch containers is calculated based
      on the number of containers and unique locations specified in the
      resource request, as well as the size of the cluster.</description>
  </property>

  <property>
    <name>yarn.scheduler.capacity.queue-mappings</name>
    <value></value>
    <description>A list of mappings that will be used to assign jobs to
      queues. The syntax for this list is
      [u|g]:[name]:[queue_name][,next mapping]*
      Typically this list will be used to map users to queues, for example,
      u:%user:%user maps all users to queues with the same name as the
      user.</description>
  </property>

  <property>
    <name>yarn.scheduler.capacity.queue-mappings-override.enable</name>
    <value>false</value>
    <description>If a queue mapping is present, will it override the value
      specified by the user? This can be used by administrators to place
      jobs in queues that are different than the one specified by the user.
      The default is false.</description>
  </property>

  <property>
    <name>yarn.scheduler.capacity.per-node-heartbeat.maximum-offswitch-assignments</name>
    <value>1</value>
    <description>Controls the number of OFF_SWITCH assignments allowed
      during a node's heartbeat. Increasing this value can improve
      scheduling rate for OFF_SWITCH containers. Lower values reduce
      clumping of applications on particular nodes. The default is 1.
      Legal values are 1-MAX_INT. This config is refreshable.</description>
  </property>

  <property>
    <name>yarn.scheduler.capacity.application.fail-fast</name>
    <value>false</value>
    <description>Whether RM should fail during recovery if previous
      applications' queue is no longer valid.</description>
  </property>

  <property>
    <name>yarn.scheduler.capacity.workflow-priority-mappings</name>
    <value></value>
    <description>A list of mappings that will be used to override
      application priority. The syntax for this list is
      [workflowId]:[full_queue_name]:[priority][,next mapping]*
      where an application submitted (or mapped to) queue full_queue_name
      and workflowId workflowId (as specified in application submission
      context) will be given priority priority.</description>
  </property>

  <property>
    <name>yarn.scheduler.capacity.workflow-priority-mappings-override.enable</name>
    <value>false</value>
    <description>If a priority mapping is present, will it override the
      value specified by the user? This can be used by administrators to
      give applications a priority that is different than the one specified
      by the user. The default is false.</description>
  </property>

</configuration>
```

Jobs submitted without an explicit queue land in the default queue; a queue can also be specified at submission time:

```sh
# default queue
hadoop jar /export/server/hadoop-3.3.6/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.6.jar pi 10 10

# submit to the small queue
hadoop jar /export/server/hadoop-3.3.6/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.6.jar wordcount -Dmapreduce.job.queuename=small /input /output3
```

To change the queue that jobs are submitted to by default, add the following to mapred-site.xml:

```xml
<property>
  <name>mapreduce.job.queuename</name>
  <value>small</value>
</property>
```